lots
Package: WA2L/edrc 1.5.57
Section: Maintenance Commands (1m)
Updated: 13 December 2014
NAME
lots - long term data save handling
SYNOPSIS
edrc/bin/lots [ -h ]

lots [ -c configfile ] -a ( collect [ -d datalist ] | save | execute ) [ -i identitylist ]

lots [ -c configfile ] -a ( lock | purge )

lots [ -c configfile ] -a clear -j jobname

lots [ -c configfile ] -a ( list_action | list_session | list_collect | list_save | list_lock | list_purge | list_clear ) [ -f from_date ] [ -t to_date ]

lots [ -c configfile ] -a list_jobs [ -d datalist ]

lots [ -c configfile ] -a ( list_datalist | list_schedule | list_volume )

lots [ -c configfile ] -a ( print_job -j jobname | print_log [ -f from_date ] [ -t to_date ] | print_logtail | print_session -s sessionname )
AVAILABILITY
WA2L/edrc
DESCRIPTION
lots
is used to copy data in an organized and automated fashion to a long term storage
device, where the data is then locked.
lots
locks the data by setting the access time of the saved data to a point in the
future (current day +
RETENTION
) and by removing the write flag of all files and directories to be locked. If the long
term data storage is a file system on a NetAPP filer having a 'SnapLock Enterprise'
volume or a 'SnapLock Compliance' volume, the data cannot be removed from Unix until the
RETENTION
has expired. If
lots
is used on a normal file system not having this functionality, the root user can
delete the data, but it is still ensured that non-root users cannot view, list or
remove the saved data.
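The locking mechanism described above can be sketched with standard Unix commands. This is a simplified illustration only, not the actual lots implementation; GNU date/stat and a throwaway file in /tmp are assumed:

```shell
#!/bin/sh
# Simplified sketch of the lock step: set the access time of a file to
# 'current day + RETENTION' and remove the write flag. On a SnapLock
# volume such a future atime is interpreted as the retention date.
set -e
FILE=/tmp/lots_lock_demo.dat
RETENTION=3650                                # days, as in schedule.dat

echo "payload" > "$FILE"
# compute the future access time (GNU date syntax assumed)
LOCKDATE=$(date -d "+${RETENTION} days" +%Y%m%d%H%M)
touch -a -t "$LOCKDATE" "$FILE"               # atime = today + RETENTION
chmod a-w "$FILE"                             # remove the write flag
ls -lu "$FILE"                                # -u displays the access time
```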
The source path where the data comes from, the permissions of the path and other
properties are saved and recorded to ensure a data restore even if the user base
has changed since the data save.
After the expiration of the
RETENTION
+
DATA_PURGE_LAG,
the
lots
command purges (=removes) the expired data and the disk space is freed up.
The following four steps are performed during the life cycle of a long term data
save:
- 1) collect
-
This step is executed once for each
DATALIST
whose schedule in
schedule.dat
matches for the current day and whose
HOSTNAME
setting matches to the host where the
lots
command is started.
A new job is created and the file and directory names to be saved
as defined in the
datalist.dat
file are collected and stored in the newly created job. Be aware that the
lots
release that created a job is always saved in the job, too (
VERSION
property when printing the job).
In the following steps, this job passes through several phases
during the whole information life cycle. The job information can be
displayed using the '
lots -a print_job -j jobname
' command.
In this step no data is saved. When the
collect
action is completed, the job is in the
save
phase.
If multiple schedule definitions of a
DATALIST
match on a certain day, only the schedule definition with the highest
RETENTION
value is executed, in order to use disk space economically. A scheduled
DATALIST
is only executed once a day unless the
RETENTION
in the
schedule.dat
is increased.
A job that is in the
save
phase
can be removed using the '
lots -a clear -j jobname
' command.
- 2) save
-
This step is executed for all jobs that are in the
save
phase based on the resolved hostname in the
collect
step.
In this step the files found in the
collect
step are copied to the long term data save location as defined
in the
volume.dat
file. The save can be performed with or without compression. The
compression method has to be defined in the
schedule.dat
file.
When the
save
action is completed, the job is in the
lock
phase.
A job that is in the
lock
phase
can be removed using the '
lots -a clear -j jobname
' command. All data that has been saved will also be removed from
the long term storage.
- 3) lock
-
This step is executed for all jobs that are in the
lock
phase, whose
save
phase completed with a
SAVESTATE
of
OK,
and for which the number of days defined in the
DATA_LOCK_LAG
setting in the
lots.cfg
file has expired.
The lock delay enables the system administrative personnel to react
to unwanted conditions, such as inconsistency of saved data, errors in
definitions which cause too much data to be saved (=waste of disk space),
and other reasons.
The
DATA_LOCK_LAG
should be defined carefully and the implications have to be known by the
administrative personnel operating the long term data save setup.
If, for instance, the
DATA_LOCK_LAG
is set to 7, the administrator has seven days after the
save
to react to a malfunction and to correct it. But during this time it is
also possible to delete the saved data, which can be a risk, too.
When a job is in the
lock
phase, but the previous phase (
save
) did not complete with a
SAVESTATE
equal to
OK,
the
lock
action for the related job will be skipped and the
data will not be locked. This means that the job will remain in the
lock
phase until it is cleared using the '
lots -a clear -j jobname
' command. This avoids the locking of saved data of unsuccessful
save
actions.
When the
lock
action is completed, the job is in the
purge
phase.
A job that is in the
purge
phase
cannot be removed any more using the '
lots -a clear -j jobname
' command, because on the long term storage the data is locked and
lots
must wait for the expiration of the lock to remove the data related
to the locked job.
- 4) purge
-
This step is executed for all jobs that are in the
purge
phase after
RETENTION
+
DATA_PURGE_LAG
has expired.
The
purge
action removes the data related to the job on the long term storage
and the disk space is therefore freed up.
When the
purge
action is completed, the job is in the
expired
phase and will remain there forever for reference and compliance
proof purposes.
The
DATA_PURGE_LAG
only needs to be defined greater than
0
if it is observed that the system clocks of the server(s) accessing
the long term storage (using
lots
) and the 'Compliance Clock' of the long term data storage device are
not completely in sync. The purge of the data is delayed by the number
of days defined in this setting, to ensure smooth data purging.
These four steps
(collect,
save,
lock,
purge)
can be executed separately using the commands:
1) lots -a collect
2) lots -a save
3) lots -a lock
4) lots -a purge
but the normal case is using the '
lots -a execute
' command which executes all steps in one session in sequence.
In a productive automated setup, the '
lots -a execute
' command will normally be called via a
cron
entry that starts
lots
once each day.
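In a crontab(1) this could look as follows. The installation path /opt/edrc is an assumption; adjust it to the actual edrc location:

```shell
# crontab entry (edit with: crontab -e)
# run the complete lots life cycle once per day at 02:30
30 2 * * * /opt/edrc/bin/lots -a execute >/dev/null 2>&1
```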
It is supported to access the same long term storage in parallel from
multiple hosts. In this case the
VARDIR
as defined in the
lots.cfg
file and the
lots.cfg
file itself should also be located centrally, on the long term storage,
to allow the most convenient operations. Placing the
VARDIR
and the
lots.cfg
configuration file on the long term storage device keeps the
configuration and state information separated from the server. This
secures the information and makes it independent from server
crashes and server re-installations, which might occur during the
life cycle of data stored for a long time on the long term data
storage. For the central, secure setup suggested here, see the
EXAMPLES
section.
Each invocation of '
lots -a action
' creates a session. A session represents all command output
and can be displayed for verification and compliance proof.
Session logs are kept forever and can be displayed
using the '
lots -a print_session -s sessionname
' command. To evaluate the sessions related to a job, use the '
lots -a list_(action|collect|save|lock|purge|clear)
' command.
To resolve the data save location for a job of a certain
DATALIST,
use the commands:
1) lots -a list_jobs [ -d datalist ]
2) lots -a print_job -j jobname
The
lots
command currently does not provide a data restore interface. Therefore
the restore of the saved files is performed using the normal operating
system commands (
cp,
cpio,
unzip,
bzip2 -d,
gzip -d,
uncompress
) and after the restore the permissions have to be adjusted, based on the
information queried using the
print_job
action as described above, again using the operating system commands (
chmod,
chown,
chgrp
).
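A manual restore might therefore look like the following sketch, assuming a gzip-compressed save. The SAVE_BASEDIR, FILENAME and SAVE_SUFFIX values are hypothetical examples of the job properties printed by 'lots -a print_job -j jobname'; the /tmp paths only make the sketch runnable:

```shell
#!/bin/sh
# Hedged sketch of a manual restore of one gzip-compressed file.
set -e
SAVE_BASEDIR=/tmp/ltstore/2014/1213/demo/01/data   # hypothetical
FILENAME=/etc_demo/hosts.dat                       # hypothetical
SAVE_SUFFIX=.gz

# (demo only: fabricate a saved file so the sketch is runnable)
mkdir -p "${SAVE_BASEDIR}$(dirname "$FILENAME")"
echo "restored content" | gzip > "${SAVE_BASEDIR}${FILENAME}${SAVE_SUFFIX}"

# 1) copy the saved file back and decompress it
RESTORE_DIR=/tmp/restore
mkdir -p "${RESTORE_DIR}$(dirname "$FILENAME")"
cp "${SAVE_BASEDIR}${FILENAME}${SAVE_SUFFIX}" \
   "${RESTORE_DIR}${FILENAME}${SAVE_SUFFIX}"
gzip -df "${RESTORE_DIR}${FILENAME}${SAVE_SUFFIX}"

# 2) re-apply the permissions from the PERM property (e.g. 640);
#    chown/chgrp would follow using the USER and GROUP properties
chmod 640 "${RESTORE_DIR}${FILENAME}"
```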
OPTIONS
- -h
-
print the usage message. The revision of
lots
is also displayed.
- -c config_file
-
configuration file.
- -i identitylist
-
comma-separated list of identities (=hostnames) of the
lots
command.
If this option is not specified and the environment variable
$LOTS_IDENTITIES
is not set, the identities of the
lots
command are all hostnames under which the host where
lots
is started is known. Using the
-i
option it can be defined to which
HOSTNAME
settings in the
schedule.dat
file the
lots
command reacts.
Example, when
-i adm_ora1_tst
is used:
:
resolve identities of this host ...
adm_ora1_tst
done.
:
Example, when the
-i
option is not used and the
$LOTS_IDENTITIES
environment variable is not set (the
adm_ora1_tst
hostname is a cluster package that is currently
running on the host):
:
resolve identities of this host ...
adm_ora1_tst acme001 loghost localhost
done.
:
- -a
-
action:
-
- collect [ -d datalist ] [ -i identitylist ]
-
collect data of scheduled datalist(s).
- save [ -i identitylist ]
-
save data that has been collected to the long term storage.
- lock
-
lock data on the long term storage from modification.
- purge
-
remove data whose locks (retention) have expired from the long
term storage.
- clear -j jobname
-
remove a job and the data saved (if any). It is only possible to
clear a job in the
save
and
lock
phases. When a job is cleared that has the highest retention for a
datalist on the current date, it is possible to perform another
collect
action of the related schedule.
- execute [ -i identitylist ]
-
perform all four long term data save steps (
collect,
save,
lock,
purge
) in one step.
- list_action [ -f from_date ] [ -t to_date ]
-
list all performed actions between the
from_date
and the
to_date.
If the
from_date
is not specified, the actions are listed from the beginning to the
to_date.
If the
to_date
is not entered, the actions are listed from the
from_date
to the end.
If neither of the two dates is specified, all actions are listed.
- list_session [ -f from_date ] [ -t to_date ]
-
list all saved sessions between the
from_date
and the
to_date.
If the
from_date
is not specified, the sessions are listed from the beginning to the
to_date.
If the
to_date
is not entered, the sessions are listed from the
from_date
to the end.
If neither of the two dates is specified, all sessions are listed.
- list_collect [ -f from_date ] [ -t to_date ]
-
list only the
collect
actions between the
from_date
and the
to_date.
If the
from_date
is not specified, the
collect
actions are listed from the beginning to the
to_date.
If the
to_date
is not entered, the
collect
actions are listed from the
from_date
to the end.
If neither of the two dates is specified, all
collect
actions are listed.
- list_save [ -f from_date ] [ -t to_date ]
-
list only the
save
actions between the
from_date
and the
to_date.
If the
from_date
is not specified, the
save
actions are listed from the beginning to the
to_date.
If the
to_date
is not entered, the
save
actions are listed from the
from_date
to the end.
If neither of the two dates is specified, all
save
actions are listed.
- list_lock [ -f from_date ] [ -t to_date ]
-
list only the
lock
actions between the
from_date
and the
to_date.
If the
from_date
is not specified, the
lock
actions are listed from the beginning to the
to_date.
If the
to_date
is not entered, the
lock
actions are listed from the
from_date
to the end.
If neither of the two dates is specified, all
lock
actions are listed.
- list_purge [ -f from_date ] [ -t to_date ]
-
list only the
purge
actions between the
from_date
and the
to_date.
If the
from_date
is not specified, the
purge
actions are listed from the beginning to the
to_date.
If the
to_date
is not entered, the
purge
actions are listed from the
from_date
to the end.
If neither of the two dates is specified, all
purge
actions are listed.
- list_clear [ -f from_date ] [ -t to_date ]
-
list only the
clear
actions between the
from_date
and the
to_date.
If the
from_date
is not specified, the
clear
actions are listed from the beginning to the
to_date.
If the
to_date
is not entered, the
clear
actions are listed from the
from_date
to the end.
If neither of the two dates is specified, all
clear
actions are listed.
- list_jobs [ -d datalist ]
-
list the jobs and the phase the job is in. If the
-d datalist
is not specified, all jobs are listed.
- list_datalist
-
list all valid datalist definitions. See also
datalist.dat(4)
for more information about the datalist format.
- list_schedule
-
list all valid schedule definitions. See also
schedule.dat(4)
for more information about the schedule format.
- list_volume
-
list all valid volume definitions. See also
volume.dat(4)
for more information about the volume format.
- print_job -j jobname
-
print all properties of a certain job. Use this
option to print the path of the saved data.
-
- GENERAL JOB PROPERTIES:
-
- PHASE
-
phase the job is currently in.
- HOSTNAME
-
hostname where the data of a matching schedule has been
collected and saved.
- VERSION
-
version of
lots
that created the related job.
- JOBNAME
-
name of the job. This name has to be specified in the
-j
option when required.
- DATALIST
-
scheduled datalist name of the job.
- DATALIST_DESCRIPTION
-
free text description of the
DATALIST
as defined in the
datalist.dat
file at the time of job creation.
- TIMEZONE
-
time zone as returned by
timezone(3)
at the time of job creation.
- SCHEDULE
-
schedule in
schedule.dat
which matched at the time of job creation.
- SCHEDULE_DESCRIPTION
-
free text description of the
SCHEDULE
as defined in the
schedule.dat
file at the time of job creation.
- RETENTION
-
effective data retention in days as defined in the
schedule.dat
file at the time of job creation.
- COMPRESSION
-
compression method of the saved data as defined in the
schedule.dat
file at the time of job creation.
- COLLECTTIMESTAMP
-
timestamp (seconds since the epoch) when the data to
be saved has been collected.
This property will be seen when the job is in the phase
save,
lock,
purge
or
expired.
- COLLECTTIMEDAT
-
this is the human readable format of
COLLECTTIMESTAMP.
- COLLECTSTATE
-
state of the
collect
action.
This property will be seen when the job is in the phase
save,
lock,
purge
or
expired.
- LOCKTIMESTAMP
-
timestamp (seconds since the epoch) when the data to
be saved will be locked.
This property will be seen when the job is in the phase
save,
lock,
purge
or
expired.
- LOCKTIMEDAT
-
this is the human readable format of
LOCKTIMESTAMP.
- LOCKSTATE
-
state of the
lock
action.
This property will be seen when the job is in the phase
purge
or
expired.
- SAVETIMESTAMP
-
timestamp (seconds since the epoch) when the data
has been saved.
This property will be seen when the job is in the phase
lock,
purge
or
expired.
- SAVETIMEDAT
-
this is the human readable format of
SAVETIMESTAMP.
- SAVESTATE
-
state of the
save
action.
This property will be seen when the job is in the phase
lock,
purge
or
expired.
- PURGETIMESTAMP
-
timestamp (seconds since the epoch) when the data
has been purged.
This property will be seen when the job is in the phase
expired.
If the
PURGESTATE
is
RETRY,
this property will also be seen when the job is in the phase
purge.
- PURGETIMEDAT
-
this is the human readable format of
PURGETIMESTAMP.
If the
PURGESTATE
is
RETRY,
this property will also be seen when the job is in the phase
purge.
- PURGESTATE
-
state of the
purge
action.
This property will normally be seen when the job is in the phase
expired.
If the
PURGESTATE
is
RETRY,
this property will also be seen when the job is in the phase
purge.
The
RETRY
PURGESTATE
shows that the
purge
action was not completely successful. The purging of the data
of a job with this condition will be repeated in subsequent calls of
lots
until it succeeds.
- DATA SAVE INFORMATION:
-
- SAVE_BASEDIR
-
directory (on the long term storage) to which the data of the related
job is saved. In normal cases only one directory is listed here,
but in special cases multiple directories might be listed.
This property will only have content when the job is in the phase
lock
or
purge.
- FILENAME
-
filename and path of the saved source file.
This property will only have content when data to be saved
as defined in
datalist.dat
has been found on the system when collecting the data.
- SAVE_SUFFIX
-
suffix of the saved file. This suffix correlates to the
chosen compression method as printed in the
COMPRESSION
property.
A file to be restored can therefore be accessed by concatenating
the properties
SAVE_BASEDIR
+
FILENAME
+
SAVE_SUFFIX.
- SIZE
-
size of the source file in bytes.
- USER
-
file owner name of the source file. The numeric user ID is
stored with the file.
- GROUP
-
file group name of the source file. The numeric group ID is
stored with the file.
- PERMISSIONS
-
symbolic representation of the source file's permissions as
displayed by the
ls(1)
command.
- PERM
-
numeric representation of the source file's permissions which
can be used with the
chmod(1)
command.
- MTIME
-
modification time of the source file in military format.
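The SAVE_BASEDIR + FILENAME + SAVE_SUFFIX concatenation described above can be sketched as follows; all property values are hypothetical examples, not output of an actual installation:

```shell
# hypothetical property values as printed by 'lots -a print_job -j jobname'
SAVE_BASEDIR=/ltstore/2014/1213/oradata/01/data
FILENAME=/u01/oradata/system01.dbf
SAVE_SUFFIX=.gz

# full path of the saved file on the long term storage:
# SAVE_BASEDIR + FILENAME + SAVE_SUFFIX
SAVED_FILE="${SAVE_BASEDIR}${FILENAME}${SAVE_SUFFIX}"
echo "$SAVED_FILE"
```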
- print_log [ -f from_date ] [ -t to_date ]
-
print the master log between the
from_date
and the
to_date.
If the
from_date
is not specified, the log file is printed from the
beginning to the
to_date.
If the
to_date
is not entered, the log file is printed from the
from_date
to the end.
If neither of the two dates is specified, the whole
log is printed.
- print_logtail
-
display a continuous master log output.
- print_session -s sessionname
-
print the session log output. Each output of
lots
is saved and, for compliance reasons, never deleted. To evaluate the
session name related to a certain action or a sequence of actions, use
list_action,
list_collect,
list_save,
list_lock,
list_purge,
list_clear
or
print_log.
- -f from_date
-
begin date in the military format
YYYY-MM-DD.
To compute dates in this format, see:
input(3),
seconds(3),
timer(1),
today(3),
tomorrow(3),
yesterday(3).
- -t to_date
-
end date in the military format
YYYY-MM-DD.
To compute dates in this format, see:
input(3),
seconds(3),
timer(1),
today(3),
tomorrow(3),
yesterday(3).
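Where the WA2L date helpers are not at hand, dates in this format can also be computed with standard tools; GNU date syntax is assumed here (BSD date uses -v instead of -d):

```shell
#!/bin/sh
# compute from_date and to_date in the required YYYY-MM-DD format
FROM_DATE=$(date -d "7 days ago" +%Y-%m-%d)
TO_DATE=$(date +%Y-%m-%d)

# typical usage (not executed here):
#   lots -a list_action -f "$FROM_DATE" -t "$TO_DATE"
echo "listing actions from $FROM_DATE to $TO_DATE"
```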
- -j jobname
-
name of a
lots
job in the format
YYYY-MM-DD_hh.mm.ss.
- -s sessionname
-
name of a
lots
session in the format
YYYY-MM-DD_hh.mm.ss.
- -d datalist
-
name of a datalist as specified in
datalist.dat
and
schedule.dat.
ENVIRONMENT
- $LOTS_CONFIGFILE
-
configuration file of
lots.
The
-c configfile
command line option has preference. If the configuration
file specified in
$LOTS_CONFIGFILE
does not exist, the default configuration file
edrc/etc/lots.cfg
is read.
- $LOTS_IDENTITIES
-
comma-separated list of identities (=hostnames) of the
lots
command. The
-i identitylist
option has preference.
EXIT STATUS
- 0
-
no error.
- 1
-
configuration file does not exist.
- 2
-
session could not be created. If you get this error, check
the file system where the
VARDIR/log
directory resides. See also
lots.cfg(4).
- 4
-
Usage printed.
- 5
-
lots
aborted by pressing <Ctrl>+<C>.
- 6
-
job as specified in
-j jobname
not found.
- 7
-
session as specified in
-s sessionname
not found.
- 8
-
cannot write to logfile.
- 9
-
cannot write to
VARDIR.
- 10
-
a job to be cleared does not exist or is not in the
save
or
lock
phases.
FILES
The
VARDIR
can be defined in the
lots.cfg
file. Default is
edrc/var/lots.
- edrc/etc/lots.cfg
-
default
lots
configuration file.
- VARDIR/log/lots.master.log
-
logfile of
lots.
Do not modify or shorten this file.
- VARDIR/log/lots.session.<SESSIONNAME>.log.gz
-
session logfile of
lots.
Display session logs using the '
lots -a print_session -s sessionname
' command. Do not modify or delete any of these files.
- VARDIR/locks/
-
lockfiles, do not edit them by hand.
- VARDIR/objects/datalist.dat
-
data save definition. In this file it is defined which set of
files and directories are saved using one handle (datalist).
- VARDIR/objects/schedule.dat
-
schedule, retention and compression definition. In this file
it is defined which datalist is scheduled to be saved on
which date.
- VARDIR/objects/volume.dat
-
definition of data save volume destinations. With this file
it is possible to locate long term data saves to different
destinations.
- VARDIR/spool/save/<JOBNAME>.job
-
jobs in the
save
phase.
Do not access job files directly, always use the
-a print_job
action to query job information, because certain job
information is resolved by
lots
while printing the job and some job properties are constructed for
certain job VERSIONs due to functionality enhancements of
lots.
- VARDIR/spool/lock/<JOBNAME>.job
-
jobs in the
lock
phase.
Do not access job files directly, always use the
-a print_job
action to query job information, because certain job
information is resolved by
lots
while printing the job and some job properties are constructed for
certain job VERSIONs due to functionality enhancements of
lots.
- VARDIR/spool/purge/<JOBNAME>.job
-
jobs in the
purge
phase.
Do not access job files directly, always use the
-a print_job
action to query job information, because certain job
information is resolved by
lots
while printing the job and some job properties are constructed for
certain job VERSIONs due to functionality enhancements of
lots.
- VARDIR/spool/expired/<JOBNAME>.job
-
jobs in the
expired
phase.
Do not access job files directly, always use the
-a print_job
action to query job information, because certain job
information is resolved by
lots
while printing the job and some job properties are constructed for
certain job VERSIONs due to functionality enhancements of
lots.
- VARDIR/state/action
-
record of all performed actions. Do not modify or shorten this file.
- VARDIR/state/diskusage
-
record of used disk space. This information is used to create reports.
- VARDIR/state/performance
-
record of durations and data throughput. This information is used to
calculate forecasts and create performance reports.
- <VOLUME_PATH>/<YEAR>/<MMDD>/<DATALIST>/<COUNTER>/data/
-
SAVE_BASEDIR
directory on the long term storage device where the data as
defined in a datalist is saved to.
- <VOLUME_PATH>/<YEAR>/<MMDD>/<DATALIST>/<COUNTER>/meta/<JOBNAME>.job
-
copy of the job file as located in
VARDIR/spool/lock/<JOBNAME>.job
or
VARDIR/spool/purge/<JOBNAME>.job
depending on the phase where the job currently is in.
In a severe emergency situation where the information in the
VARDIR/spool/lock/
and/or
VARDIR/spool/purge/
directories is lost without having a backup, the state of the
lots
jobs in the
lock
and
purge
phases, which are the most critical ones, can be reconstructed by
copying the job files from
<VOLUME_PATH>/<YEAR>/<MMDD>/<DATALIST>/<COUNTER>/meta/
back to the
VARDIR/spool/lock/
or
VARDIR/spool/purge/
directories. The meta data of saved data is locked with the
same retention as the data itself; this information is therefore
secured.
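Such a reconstruction could be sketched as follows. VOLUME_PATH and VARDIR are placeholders (the demo fabricates them under /tmp so the sketch is runnable); which spool directory a given job file belongs to depends on the phase the job was in:

```shell
#!/bin/sh
# Emergency reconstruction sketch: copy job meta files from the long
# term storage back into the spool directory of the matching phase.
set -e
VOLUME_PATH=/tmp/ltstore_demo                  # placeholder
VARDIR=/tmp/lots_var_demo                      # placeholder

# (demo only: fabricate the directory layout described in FILES)
mkdir -p "$VOLUME_PATH/2014/1213/demo/01/meta" "$VARDIR/spool/lock"
touch "$VOLUME_PATH/2014/1213/demo/01/meta/2014-12-13_02.30.00.job"

# copy the job files of this save back into the 'lock' spool; for jobs
# that were in the 'purge' phase, VARDIR/spool/purge/ is the target
cp "$VOLUME_PATH"/2014/1213/demo/01/meta/*.job "$VARDIR/spool/lock/"
```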
EXAMPLES
-
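A central setup as suggested in the DESCRIPTION could look like the following sketch; all paths are assumptions for illustration, with /ltstore standing for the mount point of the long term storage:

```shell
# keep configuration and state on the long term storage itself:
#
#   /ltstore/etc/lots.cfg        central lots configuration, containing
#                                e.g.:  VARDIR=/ltstore/var/lots
#   /ltstore/var/lots/           central VARDIR (spool, state, logs)
#
# on every host accessing the storage, point lots to the central
# configuration file:
LOTS_CONFIGFILE=/ltstore/etc/lots.cfg
export LOTS_CONFIGFILE
# lots -a execute                # then runs with the central setup
```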
SEE ALSO
edrcintro(1),
lots.cfg(4),
datalist.dat(4),
schedule.dat(4),
volume.dat(4),
bzip2(1),
chmod(1),
chown(1),
compress(1),
cpio(1),
gzip(1),
input(3),
ls(1),
seconds(3),
timer(1),
timezone(3),
today(3),
tomorrow(3),
unzip(1),
yesterday(3),
zip(1).
NOTES
The NetAPP filers are able to provide the following SnapLock
variants, based on the licenses applied:
- SnapLock Enterprise:
-
A trusted administrator mode. In this
mode a volume containing non-expired locked data can be destroyed
by a NetAPP administrator on the NetAPP filer.
- SnapLock Compliance:
-
An untrusted administrator mode. In this mode a volume containing
non-expired locked data cannot be destroyed on the NetAPP filer.
This entails the risk that, on program malfunction or handling
errors on the SnapLock volume, a large amount of disk space could be
wasted without any possibility to clean up the data and free
up the wasted disk space.
BUGS
Abort handling is fully tested under HP-UX 11i only. Therefore, when
running
lots
on other operating systems, refrain from aborting a running
lots
session if possible. However, the expected side effects are not
severe.
AUTHOR
lots was developed by Christian Walther. Send suggestions
and bug reports to wa2l@users.sourceforge.net .
COPYRIGHT
Copyright © 2014
Christian Walther
This is free software; see
edrc/doc/COPYING
for copying conditions. There is ABSOLUTELY NO WARRANTY; not
even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.