Internet DRAFT - draft-brasher-diap
draft-brasher-diap
Network Working Group D. Brasher
Internet-Draft Interlinux LTD
Intended status: Informational November 26, 2009
Expires: May 30, 2010
Distributed Internet Archive Protocol (DIAP)
draft-brasher-diap-11
Abstract
DIAP has been created to solve mid-range and below, long term
archiving requirements of the small medium enterprise. Where tape
has been deployed in the past, DIAP now offers an alternative
solution designed to be more robust and manageable in the long term
than network attached storage devices or simple disk storage alone.
The system provides a well defined structure for storing and managing
long term archives.
Status of this Memo
This Internet-Draft is submitted to IETF in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
This Internet-Draft will expire on May 30, 2010.
Copyright Notice
Copyright (c) 2009 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Brasher Expires May 30, 2010 [Page 1]
Internet-Draft Distributed Internet Archive Protocol November 2009
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the BSD License.
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3
2. Architecture . . . . . . . . . . . . . . . . . . . . . . . . . 3
2.1. Disk storage structure . . . . . . . . . . . . . . . . . . 3
2.1.1. DIAP storage units . . . . . . . . . . . . . . . . . . 3
2.1.2. Replication . . . . . . . . . . . . . . . . . . . . . 4
2.1.3. Node roles . . . . . . . . . . . . . . . . . . . . . . 4
2.1.4. Archive date range . . . . . . . . . . . . . . . . . . 4
2.1.5. Storage structure tables . . . . . . . . . . . . . . . 4
2.2. Data transfer mechanisms . . . . . . . . . . . . . . . . . 6
2.2.1. Lowest maximum bandwidth (LMB) . . . . . . . . . . . . 7
2.2.2. Data transfer timing . . . . . . . . . . . . . . . . . 8
2.2.3. Phases . . . . . . . . . . . . . . . . . . . . . . . . 8
2.2.4. Data flow, table . . . . . . . . . . . . . . . . . . . 8
2.2.5. Hyper virtual autochanger (HVA) . . . . . . . . . . . 10
2.2.6. Copy types . . . . . . . . . . . . . . . . . . . . . . 11
2.2.7. Leap years . . . . . . . . . . . . . . . . . . . . . . 11
2.2.8. Fill mechanism . . . . . . . . . . . . . . . . . . . . 11
3. Security considerations . . . . . . . . . . . . . . . . . . . 11
3.1. Passwords . . . . . . . . . . . . . . . . . . . . . . . . 11
3.2. User space . . . . . . . . . . . . . . . . . . . . . . . . 12
3.3. Application layer . . . . . . . . . . . . . . . . . . . . 12
3.4. Checksum . . . . . . . . . . . . . . . . . . . . . . . . . 12
3.5. Virtual private network . . . . . . . . . . . . . . . . . 12
3.6. Encrypted partitions, logical volumes and volumes . . . . 12
4. Community project and UK trademarks . . . . . . . . . . . . . 12
5. Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . 12
6. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 12
7. Change log . . . . . . . . . . . . . . . . . . . . . . . . . . 13
8. Informative references . . . . . . . . . . . . . . . . . . . . 13
Author's Address . . . . . . . . . . . . . . . . . . . . . . . . . 14
Brasher Expires May 30, 2010 [Page 2]
Internet-Draft Distributed Internet Archive Protocol November 2009
1. Introduction
The architecture of DIAP has been designed to; maximise archive
storage capacity, availability, restoration and recovery speed,
scalability, modularity in code and storage resilience; to minimise
operating resource overheads, impact of network outages and
management overheads; to simplify the code development cycle,
deployment, data recovery and integration with existing systems.
DIAP architecture consists of two main parts: The disk storage
structure and the data transfer mechanisms. The data transfer
mechanisms in turn consists of four algorithms. An algorithm
responsible for collecting data and three that are collectively known
as the HVA (hyper virtual auto-changer). Data transfer calculations
are based on LMB (lowest maximum bandwidth) between storage nodes.
An DIAP node can be located on any computer, usually a server. Nodes
are named and allocated roles which are abstract and can be changed.
DIAP is designed to operate on three distinct operating system nodes
and at the same time allowing more than one instance of DIAP to exist
on each node. DIAP I/O operation is asynchronous (non-blocking) but
the architecture design co-ordinates and records transfer activity.
2. Architecture
2.1. Disk storage structure
2.1.1. DIAP storage units
DIAP storage units are approximately equivalent to the volume created
on tape.Most backup software products generate something equivalent
to the tape volume on disk. These disk volumes are organised to
minimise the duplication of data to conserve space into full,
differential and incremental types. For constant data streams, for
example CCTV footage, the volumes are all effectively type full.
DIAP is designed to store full and differential types. The disk
storage structure is designed to store a full once a month and its
corresponding differentials or a full followed by x number of months
of matching differential volumes. Some control is available when
choosing a name for newly generated volumes. If the type of data is
streamed and cannot be differentiated against a full then the option
to not collect full volumes is available. Then all volumes are full
but archived in the same way as differentials.
Brasher Expires May 30, 2010 [Page 3]
Internet-Draft Distributed Internet Archive Protocol November 2009
2.1.2. Replication
The disk structure ensures volumes are replicated across three
operating systems which can be located within IP sight of each other
over LAN, WLAN or WAN. This provides geographical resilience. The
structure on the disk is organised into slots which can be
directories.
2.1.3. Node roles
The nodes are are allocated one of three roles; A, B or C. The roles
a machines allocated can be changed in future. This allows machines
to be re-located. For each node year slots are created, then month
slots created and day slots including the special slots ad0 and
aFull$$, where $ is an integer between 0 and 2. Full01 will store a
Full volume at the beginning of the month and skip d1. Full02 is
there for additional redundancy and to cope with the scenario where
the current month is the last (this is not default behaviour). The
special slot d0 has more than one purpose and is described in more
detail in the section describing the data transfer mechanisms.
2.1.4. Archive date range
The start year and end year can be chosen according to storage
requirements. Retrospective archiving can be achieved by archiving
in the past by creating a DIAP disk structure from any chosen date in
the past. Retrospective archiving allows the user to transfer
archives from old tapes into DIAP. The number of years of archiving
required in the future is selectable.
2.1.5. Storage structure tables
These diagrams show the disk storage structure. The root of the
structure is a n*years then filled with months then days including
the special slots.
Brasher Expires May 30, 2010 [Page 4]
Internet-Draft Distributed Internet Archive Protocol November 2009
Slots for each year, extendable and pre 2009 is retrospective
+-------+--------+--------+--------+
| Slots | node A | node B | node C |
+-------+--------+--------+--------+
| | ... | ... | ... |
| | 2006 | 2006 | 2006 |
| | 2007 | 2007 | 2007 |
| | 2008 | 2008 | 2008 |
| | 2009 | 2009 | 2009 |
| | 2010 | 2010 | 2010 |
| | 2011 | 2011 | 2011 |
| | ... | ... | ... |
+-------+--------+--------+--------+
Table 1: Year slots
Slots for each month located in each year
+-------+--------+--------+--------+
| Slots | node A | node B | node C |
+-------+--------+--------+--------+
| | mth1 | mth1 | mth1 |
| | mth2 | mth2 | mth2 |
| | mth3 | mth3 | mth3 |
| | mth4 | mth4 | mth4 |
| | mth5 | mth5 | mth5 |
| | mth6 | mth6 | mth6 |
| | mth7 | mth7 | mth7 |
| | mth8 | mth8 | mth8 |
| | mth9 | mth9 | mth9 |
| | mth10 | mth10 | mth10 |
| | mth11 | mth11 | mth11 |
| | mth12 | mth12 | mth12 |
+-------+--------+--------+--------+
Table 2: Month slots
Brasher Expires May 30, 2010 [Page 5]
Internet-Draft Distributed Internet Archive Protocol November 2009
Slots in each day located in each month including special slots
+--------+--------+--------+--------+
| Slots | node A | node B | node C |
+--------+--------+--------+--------+
| (Dirs) | Full01 | Full01 | Full01 |
| | Full02 | Full02 | Full02 |
| | d0 | d0 | |
| | d01 | d01 | d01 |
| | d02 | d02 | d02 |
| | d03 | d03 | d03 |
| | d04 | d04 | d04 |
| | d05 | d05 | d05 |
| | d06 | d06 | d06 |
| | d07 | d07 | d07 |
| | d08 | d08 | d08 |
| | d09 | d09 | d09 |
| | d10 | d10 | d10 |
| | d11 | d11 | d11 |
| | d12 | d12 | d12 |
| | d13 | d13 | d13 |
| | d14 | d14 | d14 |
| | d15 | d15 | d15 |
| | d16 | d16 | d16 |
| | d17 | d17 | d17 |
| | d18 | d18 | d18 |
| | d19 | d19 | d19 |
| | d20 | d20 | d20 |
| | d21 | d21 | d21 |
| | d22 | d22 | d22 |
| | d23 | d23 | d23 |
| | d24 | d24 | d24 |
| | d25 | d25 | d25 |
| | d26 | d26 | d26 |
| | d27 | d27 | d27 |
| | d28 | d28 | d28 |
| | d29 | d29 | d29 |
| | d30 | d30 | d30 |
| | d31 | d31 | d31 |
+--------+--------+--------+--------+
Table 3: Day of month slots
2.2. Data transfer mechanisms
Brasher Expires May 30, 2010 [Page 6]
Internet-Draft Distributed Internet Archive Protocol November 2009
2.2.1. Lowest maximum bandwidth (LMB)
[Recalibrate this section to adjust for the most recent developments]
LMB = Lowest Maximum Bandwidth between any three nodes NB: actual max
transfer will vary so test transfers are recommended for accuracy.
LMB assumes all available bandwidth is allocated to running DIAP.
Max Full01 = LMB x 6 hrs. This assumes no transfer interruptions and
that the maximum bandwidth is constant.
Example calculations for a month of archiving using a single full
volume and a corresponding daily differentials.
Ave. diff = (Sum 29 (or a month) daily differentials) / 29.
Average differential is variable depending on your storage growth,
this represents a trend and can be an estimate to start with, but by
watching the trend of differential growth more accurate calculations
can be made. It is assumed your differentials are always smaller in
size than the initial full copy.
Min DIAP slot size (node a) = (Max Full01 x 2) + (29 x ave. diff) +
(1 x ave diff) plus 1 x ave diff to account for d0.
Min DIAP slot size (node b/c) = (Max Full01 x 2) + (29 x ave diff).
You can include transfer log files in the Min DIAP slot size, for
simplicity they have been omitted.
Example system
+----------------------+------------+------------------+------------+
| Example system | LMB x 6 | Ave diff | Max |
| | hrs | | aFull01 |
+----------------------+------------+------------------+------------+
| LMB occurs between | 1Mbit/Sec | Estimated 500 | 2.6 GiB |
| b->c | | MiB | |
+----------------------+------------+------------------+------------+
Table 4: Example system
Min DIAP slot size (node a) (2.6 x 2) + (29 x 0.5) + 0.5 = 20.2 GiB
Min DIAP slot size (node b-c) (2.6 x 2) + (29 x 0.5) = 19.7 GiB
If a copy fails then the system will retry the next day but you loose
the day of failure. Logging can be used to trace successful copies.
Brasher Expires May 30, 2010 [Page 7]
Internet-Draft Distributed Internet Archive Protocol November 2009
2.2.2. Data transfer timing
[Maybe re-write this section and incorporate into phases, or simplify
and keep both sections]
Two entry points, Full01 beginning of month and d0 for the remaining
days. Assuming entry points are filled during the day before the
cycle begins at night. Scheduled jobs split between 3 nodes, d0 is
cleared after copy to d$. The system reduces single point of failure
by creating a single full copy on each node at the beginning of the
month then at the end of the month to cover the next 30 day diap
cycle. The copies between a-a and a-b occur in the first three hours
then the copy from b-c happens after three hours. These times are
changed as required.
Nightly copies to and between nodes are made to new slots, if due to
some fail conditions a node is unavailable then the copy is not made,
however the next day when communication is restored copies continue
to the next nightly directory. This increases robustness over
previous layout as the next nightly copy is not dependent on the
success of the previous night's copy. Only two copies between nodes
are made between days 3-30, day 1 a single full and day 2 full and
day 30 does make an internal copy on all nodes.
2.2.3. Phases
To be written. Transfer from implementation.
2.2.4. Data flow, table
Flow of data (* for all in column)
+------------+----------------+----------------+----------------+
| Day - Time | A | B | |
| D1-T=0 | Full01-> | Full01 | |
| D2-T=0 | | | |
| D2-T=0 | d0->d1 *(a->a) | | |
| D2-T=0 | d0->d1 *(a->b) | | |
| D2-T=3 | | d1->d1 | |
| D3-T=0 | d0->d2 | | |
| D3-T=0 | d0->d2 | | |
| D3-T=3 | | d2->d2 | |
| D4-T=0 | d0->d3 | | |
| D4-T=0 | d0->d3 | | |
| D4-T=3 | | d3->d3 | |
| D5-T=0 | d0->d4 | | |
| D5-T=0 | d0->d4 | | |
| D5-T=3 | | d4->d4 | |
Brasher Expires May 30, 2010 [Page 8]
Internet-Draft Distributed Internet Archive Protocol November 2009
| D6-T=0 | d0->d5 | | |
| D6-T=0 | d0->d5 | | |
| D6-T=3 | | d5->d5 | |
| D7-T=0 | d0->d6 | | |
| D7-T=0 | d0->d6 | | |
| D7-T=3 | | d6->d6 | |
| D8-T=0 | d0->d7 | | |
| D8-T=0 | d0->d7 | | |
| D8-T=3 | | d7->d7 | |
| D9-T=0 | d0->d8 | | |
| D9-T=0 | d0->d8 | | |
| D9-T=3 | | d8->d8 | |
| D10-T=0 | d0->d9 | | |
| D10-T=0 | d0->d9 | | |
| D10-T=3 | | d9->d9 | |
| D11-T=0 | d0->d10 | | |
| D11-T=0 | d0->d10 | | |
| D11-T=3 | | d10-d10 | |
| D12-T=0 | d0->d11 | | |
| D12-T=0 | d0->d11 | | |
| D12-T=3 | | d11->d11 | |
| D13-T=0 | d0->d12 | | |
| D13-T=0 | d0->d12 | | |
| D13-T=3 | | d12->d12 | |
| D14-T=0 | d0->d13 | | |
| D14-T=0 | d0->d13 | | |
| D14-T=3 | | d13->d13 | |
| D15-T=0 | d0->d14 | | |
| D15-T=0 | d0->d14 | | |
| D15-T=3 | | d14->d14 | |
| D16-T=0 | d0->d15 | | |
| D16-T=0 | d0->d15 | | |
| D16-T=3 | | d15->d15 | |
| D17-T=0 | d0->d16 | | |
| D17-T=0 | d0->d16 | | |
| D17-T=3 | | d16->d16 | |
| D18-T=0 | d0->d17 | | |
| D18-T=0 | d0->d17 | | |
| D18-T=3 | | d17->d17 | |
| D19-T=0 | d0->d18 | | |
| D19-T=0 | d0->d18 | | |
| D19-T=3 | | d18->d18 | |
| D20-T=0 | d0->d19 | | |
| D20-T=0 | d0->d19 | | |
| D20-T=3 | | d19->d19 | |
| D21-T=0 | d0->d20 | | |
| D21-T=0 | d0->d20 | | |
| D21-T=3 | | d20->d20 | |
Brasher Expires May 30, 2010 [Page 9]
Internet-Draft Distributed Internet Archive Protocol November 2009
| D22-T=0 | d0->d21 | | |
| D22-T=0 | d0->d21 | | |
| D22-T=3 | | d21->d21 | |
| D23-T=0 | d0->d22 | | |
| D23-T=0 | d0->d22 | | |
| D23-T=3 | | d22->d22 | |
| D24-T=0 | d0->d23 | | |
| D24-T=0 | d0->d23 | | |
| D24-T=3 | | d23->d23 | |
| D25-T=0 | d0->d24 | | |
| D25-T=0 | d0->d24 | | |
| D25-T=3 | | d24->d24 | |
| D26-T=0 | d0->d25 | | |
| D26-T=0 | d0->d25 | | |
| D26-T=3 | | d25->d25 | |
| D27-T=0 | d0->d26 | | |
| D27-T=0 | d0->d26 | | |
| D27-T=3 | | d26->d26 | |
| D28-T=0 | d0->d27 | | |
| D28-T=0 | d0->d27 | | |
| D28-T=3 | | d27->d27 | |
| D29-T=0 | d0->d28 | | |
| D29-T=0 | d0->d28 | | |
| D29-T=3 | | d28->d28 | |
| D30-T=0 | Full01->Full02 | | |
| D30-T=0 | | Full01->Full02 | Node C |
| D30-T=0 | | | Full01->Full02 |
| D30-T=0 | d0->d29 | | |
| D30-T=0 | d0->d29 | | |
| D30-T=3 | | d29->d29 | |
+------------+----------------+----------------+----------------+
Table 5: Data flow
Start 00:00 - End 00:06 - T = 0(00:00) - T=3(03:00)
All copies from a-a are not used in bandwidth calculations.
2.2.5. Hyper virtual autochanger (HVA)
HVA is the term used to collectively describe the three algorithms
that work together but operate independently on each node to ensure
the data transfers occur as describe in the previous two sections.
This term is derived from the term virtual auto changer. A virtual
auto changer still requires hardware tape drives, 'Hyper' takes this
one stage further by emulating the virtual auto changer in software.
Brasher Expires May 30, 2010 [Page 10]
Internet-Draft Distributed Internet Archive Protocol November 2009
2.2.6. Copy types
Transfer from implementation.
2.2.7. Leap years
Transfer from implementation.
2.2.8. Fill mechanism
The filling mechanism works as follows: The start time is an integer
between 0 and 11. The fill is triggered by a scheduling application
like cron. Then a check is made to see if the previous days copy was
successfully. If not then an alert is made and logged for later use.
If yes then a search, using a pre-defined string, is made in a
directory containing the backup volumes. If full volumes have been
selected for collection then a check for the day of month is made.
[Currently in implementation this is day 2 - this will be settable to
any day]. The name of the full volume is a pre-defined string. If a
full needs to be transferred then the most recent full volume is
located. A check is made to see if the full has been collected
before is made. If no then the full is copied to the appropriate
slot and a date and sha1sum is created and located in the slot with
the volume. [see section; Security considerations, checksum for more
detail]. The activity is logged and the algorithm ends. If yes then
the activity is logged and the algorithm stops. If a full volume is
not required to be transferred then the most recent differential
volume is located using a pre-defined string. The contents of d0 are
cleared. A check is made to verify the differential has not been
collected before. If yes then the activity is logged and the
algorithm ends. If not the the latest differential is copied to the
appropriate slot and a date and sha1sum created and located in the
slot with the volume. Activity is logged and the algorithm ends.
3. Security considerations
In implementation it is recommended these security precautions are
followed.
3.1. Passwords
Do not store any passwords on file. Passwords should be stored in
memory temporarily. When a password is requested the entry view is
hidden. New account passwords are quality checked and a warning
given if not secure.
Brasher Expires May 30, 2010 [Page 11]
Internet-Draft Distributed Internet Archive Protocol November 2009
3.2. User space
Implementation to operate in user space to reduce risk of system
compromise and the effect of system compromises.
3.3. Application layer
Handle network communications with OpenSSH. [1] Generate unique RSA
or better certificates. Pass-wordless certificates should be re-
generated often. Rsync uses OpenSSH to transfer data. Use different
port to the standard SSH port 22 and individually set these for each
node. RFC4251 [RFC4251]
3.4. Checksum
Use sha1 checksum RFC3174 [RFC3174] and date stamp volumes as they
enter DIAP. This information can be used to validate the integrity
of stored archives in the future.
3.5. Virtual private network
Use a virtual private network between nodes for an additional layer
of security.
3.6. Encrypted partitions, logical volumes and volumes
Use encrypted partitions or logical volumes to enhance physical
security. Use encrypted archive volumes. The encryption may be
applied by backup software initial responsible for generating the
volumes.
4. Community project and UK trademarks
A community software implementation resides at DIASER (R) [2] A UK
trademark exists for DIAP and DIASER to protect the acronyms for Open
Source community development.
5. Conclusion
To be written.
6. Acknowledgements
Thanks are due to my wife Marisa and Myles McClelland and a number of
individuals from various groups. Also Stephen Pelc of MPE Forth [3]
Brasher Expires May 30, 2010 [Page 12]
Internet-Draft Distributed Internet Archive Protocol November 2009
for SME deployment context and advice and IPR consultancy. JISC [4]
for providing technical development funding through OMII-UK [5] and
ECS [6] (Southampton University) in collaboration with Interlinux
Ltd. [7]
7. Change log
26 Nov 09 - Error corrections and additions.
25 Nov 09 - Complete document overhaul incorporating six months
technical development.
09 Nov 09 - Change log correction.
09 Nov 09 - Fill algorithm described.
27 Jul 09 - Spell check.
27 Jul 09 - Remove section to avoid IP infringement.
15 Apr 09 - Extended data retention.
03 Dec 08 - Corrected Architecture.
02 Dec 08 - Refined Architecture.
06 July 08 - Architecture - arithmetic. Acknowledgements.
16 May 08 - Address.
8. Informative references
[RFC4251] Ylonen, T., "The Secure Shell (SSH) Protocol
Architecture", RFC 4251, January 2006.
[RFC3174] Eastlake, D., "US Secure Hash Algorithm 1 (SHA1)",
RFC 3174, September 2001.
[DIAP] Brasher, D., "Distributed Internet Archive Protocol
(DIAP)", Nov 2009, <http://www.diap.org.uk>.
[DIASER] Brasher, D., "Distributed Internet Archiving for
Educational Repositories (DIASER)", April 2009,
<http://www.diaser.org.uk>.
[DIASER manual]
Brasher Expires May 30, 2010 [Page 13]
Internet-Draft Distributed Internet Archive Protocol November 2009
Brasher, D., "DIASER manual", November 2009,
<http://www.diaser.org.uk/manual.html>.
[1] <http://www.openssh.com/>
[2] <http://www.diaser.org.uk>
[3] <http://www.mpeforth.com>
[4] <http://www.jisc.ac.uk>
[5] <http://www.omii.ac.uk>
[6] <http://www.ecs.soton.ac.uk>
[7] <http://www.interlinux.co.uk>
Author's Address
Damian Brasher
Interlinux LTD
PO Box 1623
Southampton, Hampshire SO15 9AE
United Kingdom
Email: dbrasher@interlinux.co.uk
Brasher Expires May 30, 2010 [Page 14]