INTERNET-DRAFT                                       C. Charles Fan
Expires: September 2005                                  Rainfinity

                                                        Dave Noveck
                                                  Network Appliance

                                                        Mario Wurzl
                                                                EMC

                                                         March 2005

                NFSv4 Global Namespace Problem Statement
                draft-fan-nfsv4-global-namespace-00.txt

Status of this Memo

     By submitting this Internet-Draft, I certify that any applicable
     patent or other IPR claims of which I am aware have been disclosed,
     or will be disclosed, and any of which I become aware will be
     disclosed, in accordance with RFC 3668.

     Internet-Drafts are working documents of the Internet Engineering
     Task Force (IETF), its areas, and its working groups.  Note that
     other groups may also distribute working documents as Internet-
     Drafts.

     Internet-Drafts are draft documents valid for a maximum of six
     months and may be updated, replaced, or obsoleted by other
     documents at any time.  It is inappropriate to use Internet-Drafts
     as reference material or to cite them other than as "work in
     progress."

     The list of current Internet-Drafts can be accessed at
     http://www.ietf.org/ietf/1id-abstracts.txt.  The list of Internet-
     Draft Shadow Directories can be accessed at
     http://www.ietf.org/shadow.html.

Copyright Notice

     Copyright (C) The Internet Society (2004).  All Rights Reserved.


Abstract

     NFS is one of the primary data access protocols for NAS, and
     naturally NFS users have been demanding a global namespace for NFS.
     This document intends to explain the rationale for a global
     namespace, why it is an important feature for a network file system
     protocol, and the problems that a global namespace for files would
     solve.



Fan, Noveck, Wurzl       Expires September 2005                 [Page 1]





Internet Draft     Global Namespace Problem Statement         March 2005


Table of Contents

     1.    Introduction . . . . . . . . . . . . . . . . . . . . . . 2
     2.    The Applications . . . . . . . . . . . . . . . . . . . . 3
     3.    The Requirements . . . . . . . . . . . . . . . . . . . . 4
     4.    NFSv4 Global Namespace . . . . . . . . . . . . . . . . . 7
     5.    Suggested Approach . . . . . . . . . . . . . . . . . . . 7
           Acknowledgements . . . . . . . . . . . . . . . . . . . . 8
           Normative References . . . . . . . . . . . . . . . . . . 8
           Informative References . . . . . . . . . . . . . . . . . 8
           Author's Address . . . . . . . . . . . . . . . . . . . . 9
           Full Copyright Statement . . . . . . . . . . . . . . . . 9


1. Introduction

     In recent years, the range and complexity of Network Attached
     Storage (NAS) deployments have increased greatly.  Trends such as
     grid computing, virtualization, and information lifecycle
     management have helped to drive these wider, more complex
     deployments.  As a result, NAS users need tools that make these
     more complex deployments manageable, and this need has brought
     forth a new set of requirements.  A global namespace for files has
     emerged as one common requirement from NAS users.

     NFS is one of the primary data access protocols for NAS, and
     naturally NFS users have been demanding a global namespace for NFS
     [Thurlow].  This document intends to explain the rationale for a
     global namespace, why it is an important feature for a network file
     system protocol, and the problems that a global namespace for files
     would solve.

     Fundamentally, how a network file-system object is named for access
     should be independent of where it is placed for storage.  A global
     namespace is a logical organization of file-system objects for user
     access.  Users expect it to be uniform, location independent, and
     transparent to location changes; it is typically hierarchical.
     The creators and owners of file-system objects are the ones who
     decide how these objects should be named and where they should be
     found for access within the hierarchy of the global namespace.

     On the other hand, administrators are the ones to decide where the
     files should be physically stored.  In a large enterprise storage
     environment, there are a large number of NAS storage nodes, often
     distributed among multiple geographical sites.  The files may be
     distributed across the enterprise to satisfy a number of different
     goals including optimizing cost, optimizing performance and
     providing greater convenience of access and greater availability of
     the NAS data.

     A global namespace provides a unified view for data access,
     independent of where data is physically placed.  This namespace
     consists of mappings between names in the namespace and their
     corresponding physical locations, and acts as a road map to guide
     users to the locations of the data they seek.  This namespace is
     uniform and consistent for all users who access the data, and its
     presentation to the clients remains the same even when mappings
     change.
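
     As a minimal sketch of the mapping idea described above, the
     following Python fragment models a namespace as a table from
     logical names to physical locations.  The class name, host names
     and paths are hypothetical and are not drawn from any NFS
     implementation; the point is only that the logical name a client
     uses stays the same when the mapping behind it is updated.

```python
# Hypothetical sketch: names, hosts and paths are illustrative only.
class GlobalNamespace:
    """Maps logical names to physical locations (server, export path)."""

    def __init__(self):
        self._locations = {}

    def bind(self, logical_name, server, export_path):
        """Create or update the mapping for one logical name."""
        self._locations[logical_name] = (server, export_path)

    def resolve(self, logical_name):
        """Return the current physical location for a logical name."""
        return self._locations[logical_name]


ns = GlobalNamespace()
ns.bind("/eng/projects", "nas1.example.com", "/vol/vol7/projects")
before = ns.resolve("/eng/projects")

# An administrator moves the data; only the mapping changes, while the
# logical name "/eng/projects" seen by users stays the same.
ns.bind("/eng/projects", "nas2.example.com", "/vol/vol2/projects")
after = ns.resolve("/eng/projects")
```

     In this sketch the client-visible name never changes; how a real
     client learns of the updated mapping is exactly the protocol
     question this document raises.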


2. The Applications

     This section lists a few applications that would benefit from a
     global namespace.

2.1 NAS Storage Virtualization

     As NAS storage infrastructure scales up, each data site may house a
     large number of NAS storage units in a data center, and it is
     common to have multiple data centers in an enterprise.  It is
     desired that the data stored by these NAS units be managed as a
     single coherent space, rather than as a large number of disjoint
     data islands.

     A global namespace is an important enabler for NAS storage
     virtualization.  It provides a single-system view for data access
     into the enterprise-wide NAS storage.  A global namespace should be
     reconfigurable online: when the physical location of some data
     changes, the global namespace mapping is updated and users continue
     accessing the data without interruption.  With a global namespace
     in place, a virtualization solution can dynamically balance
     capacity among the physical NAS units behind the scenes, without
     affecting user access.

2.2 Replication for Load Balancing & Grid Computing

     A storage grid is an integral part of any grid computing
     architecture.  A well-designed global namespace is a necessary
     component in creating a storage grid using NAS.  Multiple data
     replicas can be created, and the global namespace will be able to
     transparently guide clients to the appropriate replica based on
     load, distance, and access requirement (read or write).  This would
     effectively multiply the throughput capabilities of a storage grid,
     transparently to the users and applications.
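
     One way to picture replica selection by load, distance, and access
     requirement is the Python sketch below.  The Replica fields, the
     load and distance metrics, and the tie-breaking order are all
     assumptions made for illustration; this document does not specify
     a selection policy.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    server: str
    load: float      # 0.0 (idle) to 1.0 (saturated); hypothetical metric
    distance: int    # e.g. network hops from client; hypothetical metric
    writable: bool


def choose_replica(replicas, need_write):
    """Pick the least-loaded, then nearest, replica usable for the access."""
    candidates = [r for r in replicas if r.writable or not need_write]
    if not candidates:
        raise LookupError("no replica satisfies the access requirement")
    return min(candidates, key=lambda r: (r.load, r.distance))


replicas = [
    Replica("nas-east.example.com", load=0.80, distance=1, writable=True),
    Replica("nas-west.example.com", load=0.20, distance=4, writable=False),
]
reader = choose_replica(replicas, need_write=False)
writer = choose_replica(replicas, need_write=True)
```

     Here a read is steered to the lightly loaded read-only replica,
     while a write must go to the only writable copy.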

2.3 Transparent Migration

     A similar benefit applies to migration.  A global namespace can
     keep the data presentation to the users and applications constant,
     while updating the mapping to the new location after the migration
     is complete.  This avoids the disruptive post-migration unmount and
     re-mount, while allowing the clients to find the right location for
     the requested data.

     This not only benefits one-time migrations, when users consolidate
     or upgrade storage equipment, but also makes possible dynamic
     capacity balancing or hierarchical storage management applications
     that frequently move the data.

2.4 Transparent Fail-over

     With a global namespace, a high availability solution can handle
     the case where the backup node does not fully assume the identity
     of the primary node.  The global namespace gracefully points the
     clients to the new location of the data, and makes the fail-over a
     more transparent experience for the end-users and applications.

     This capability can be applied to online storage management as
     well.  Even without a failure, global namespace facilities can be
     used to point the clients to another server while the original
     server is being serviced.  After the service is completed, the
     clients can be redirected back.  The existence of a global
     namespace greatly helps reduce downtime during normal storage
     management.

2.5 Information Lifecycle Management

     As data go through different phases in their lifecycle, their
     storage requirements change.  An ILM solution optimizes the storage
     for the data according to the phase of their lifecycle.  A global
     namespace provides a way to guide clients to the appropriate
     physical locations of the data at any given time, without the use
     of stubs on the file storage itself.


3. The Requirements

     At least three different kinds of namespaces have been referred to
     as a "global namespace" for file storage:

     1    Intra-cluster namespace.  This is the unified namespace for a
          set of NAS servers in a tightly-coupled or aggregated cluster.
          People refer to it as a "global" namespace, as opposed to the
          "local" namespace of each node in the cluster.  Many
          proprietary intra-cluster namespace schemes exist today as
          part of single-vendor solutions.

     2    Enterprise namespace.  This is the most requested form of
          "global namespace" by enterprise storage administrators.  An
          enterprise namespace provides a uniform view into network file
          storage for an entire enterprise.

     3    World-wide namespace.  This makes possible "world-wide file
          storage", with a global URL for each file.  This could be
          achieved by an extension of the enterprise namespace scheme.

     This draft focuses on the enterprise namespace.  Enterprise file
     storage environments will continue to grow and will continue to be
     heterogeneous.  Standardization supports interoperability between
     different vendors, and having a standards-based namespace solution
     for NFSv4 will help the wide adoption of the protocol.

     What are the requirements for an enterprise-wide namespace?  Here
     is a list of fundamental requirements:


     -    Location Independence: The namespace tree should be structured
          to reflect organizational or logical associations, independent
          of the physical location of the data.  This implies that there
          needs to be some sort of location table that serves to link
          the logical namespace and the physical locations.

     -    Uniformity of View: There should be a single location table of
          the namespace that all agree is authoritative.  This implies
          the existence of a root server and/or central repository for
          an enterprise domain, but does not imply that each client must
          mount into this unified namespace in the same way.

     -    Transparency: When the physical location of the data changes
          for administrative reasons, for example through migration or
          replication, the namespace presented to the client
          applications should remain constant.  The namespace map entry
          can be updated transparently to the clients: the client
          applications continue running and the namespace remains
          constant, while the data is now served from a different
          physical location.

     -    Security: The deployment of a namespace solution must not
          compromise the security of data access.

     In addition to the above requirements, we must ask the following
     questions as well:


     -    Granularity of namespace mapping: Can the namespace mapping
          happen at file system granularity, directory granularity,
          file granularity, or sub-file granularity?

     -    Hierarchical Mapping: Is it possible for namespace entry /a/b
          to link to file server A, while /a/b/c links to file server
          B?

     -    Variable Support: Can the namespace mapping differ depending
          on variables such as client OS, client geographical location,
          or time of day?  This is highly desired in many customer
          environments.

     -    Manageability.  Can the namespace be accessed and modified in
          real time by administrators?  By applications?  By user
          groups?  How fast does a namespace mapping change propagate to
          all clients?

     -    Cycle Prevention.  Will the namespace tree be guaranteed to be
          acyclic?

     -    Multi-protocol Interoperability.  Will NFSv2 and v3 clients be
          able to use this same namespace?  Will this namespace be
          synchronized with the CIFS namespace?

     A viable global namespace solution will need to be location
     independent, unified, transparent and secure.  We should also
     consider additional user requirements to make sure we have a
     solution that addresses the needs of enterprise storage
     administrators.
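
     The hierarchical mapping question raised above (/a/b linking to
     file server A while /a/b/c links to file server B) can be
     illustrated with a longest-prefix lookup.  The Python sketch below
     is one possible reading of that requirement, not a mechanism
     defined by this document; the function name and the server labels
     are hypothetical.

```python
def resolve(mappings, path):
    """Return (server, remainder) for the longest mapped prefix of path."""
    parts = [p for p in path.split("/") if p]
    # Try the longest prefix first, then progressively shorter ones.
    for end in range(len(parts), 0, -1):
        prefix = "/" + "/".join(parts[:end])
        if prefix in mappings:
            remainder = "/".join(parts[end:])
            return mappings[prefix], remainder
    raise KeyError(path)


# Hypothetical namespace entries matching the example in the text.
mappings = {"/a/b": "serverA", "/a/b/c": "serverB"}

under_b = resolve(mappings, "/a/b/x.txt")
under_c = resolve(mappings, "/a/b/c/d")
```

     With this lookup, names under /a/b/c resolve to server B, while
     every other name under /a/b resolves to server A.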


4. NFSv4 Global Namespace

     For NFS v2/v3 environments, the most popular namespace solution
     implemented is the automounter daemon, with automounter maps
     centrally managed on an NIS server or LDAP server.  The popularity
     of this solution shows that it addresses some of the namespace
     requirements outlined above.  In particular, it supports the
     "location independence" requirement at export granularity, the
     "uniformity of view" requirement and the "security" requirement.
     In addition, it supports hierarchical mapping and wildcard
     variables.  Because there is no server-to-server redirect, there
     are no cycle issues here either.

     So why do some NFS enterprise users still ask for a "global
     namespace"?  What is lacking in an automounter-based solution?
     First, the update of the automounter map is not completely
     transparent.  Clients that have running applications keeping the
     old mount active will not let go of the old mount.  For some
     versions of some OSes, the old mount still won't be released even
     after it becomes inactive.  Given the variety of client OSes and
     versions, this is a difficult problem to solve completely.

     Second, the granularity of this solution is at the export level.
     Some of the above-mentioned applications that require a global
     namespace, such as Load Balancing and ILM applications, call for
     finer granularity (directory, file, and sub-file).

     In addition, some administrators have had experience with the
     global namespace solutions of other network file access protocols,
     such as CIFS and AFS.  CIFS includes a specification of Dfs links
     that supports the deployment of a Dfsroot namespace server.  AFS
     can dynamically map its volumes to different physical locations
     through the use of its Volume Location Database (VLDB).  These
     administrators desire that comparable functionality be available in
     NFSv4.

     RFC 3530 [RFC3530] specifies basic functionality useful for
     implementing an NFSv4 global namespace, either by using solely the
     facilities within RFC 3530, or by augmenting them through features
     to be added in a minor version.


5. Suggested Approach

     First, we could choose a central repository, such as LDAP, for the
     namespace mappings.  We can work to define a standard schema for
     the NFS namespace mappings.  This work is not part of the NFSv4
     protocol itself, but it is reasonable for us to address it for the
     specific case of the NFS namespace.  We should clearly define how
     servers and clients can access this central repository, and how it
     should support not only NFSv4, but v3 and v2 clients as well.

     Second, we need to clarify the client-server interactions based on
     RFC 3530.  This work is already under way, with both implementation
     and suggestions for errata for RFC 3530.  Both the migration case
     and the pure referral case need to be fully considered [Noveck].
     Security issues should also be considered, so that the proposed
     scheme does not compromise the existing level of security.  The
     hope is that this challenge will be overcome, and that we will be
     able to have the first client, server and namespace server
     reference implementations of the basic use of NFS4ERR_MOVED and
     fs_locations.
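
     To make the referral interaction concrete, the Python sketch below
     simulates a client that receives NFS4ERR_MOVED, asks for the
     location information, and retries at the new server.  It is a toy
     model under stated assumptions: FakeServer, its methods, and the
     server names are hypothetical, status codes are plain strings
     rather than wire values, and the location list is reduced to a
     list of server names.

```python
NFS4_OK = "NFS4_OK"
NFS4ERR_MOVED = "NFS4ERR_MOVED"  # symbolic name only; no wire encoding


class FakeServer:
    """Stand-in for an NFSv4 server; not a real protocol implementation."""

    def __init__(self, files=None, referrals=None):
        self.files = files or {}          # path -> data held locally
        self.referrals = referrals or {}  # path -> list of server names

    def read(self, path):
        if path in self.referrals:
            return NFS4ERR_MOVED, None
        return NFS4_OK, self.files[path]

    def get_fs_locations(self, path):
        return self.referrals[path]


def read_following_referrals(servers, start, path, max_hops=4):
    """Read path, following NFS4ERR_MOVED to the first listed location."""
    name = start
    for _ in range(max_hops):
        status, data = servers[name].read(path)
        if status == NFS4_OK:
            return name, data
        name = servers[name].get_fs_locations(path)[0]
    raise RuntimeError("too many referrals")


servers = {
    "nsroot": FakeServer(referrals={"/eng/spec.txt": ["nas2"]}),
    "nas2": FakeServer(files={"/eng/spec.txt": b"draft"}),
}
result = read_following_referrals(servers, "nsroot", "/eng/spec.txt")
```

     The hop limit is a design choice of this sketch; it keeps a
     misconfigured chain of referrals from looping indefinitely.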

     Third, we should define a mechanism by which clients in the
     enterprise know where to find the root of the NFS enterprise
     namespace.  One simple solution is to leverage the DNS domain, and
     set up a convention that the DNS name nfsroot always corresponds to
     the root namespace server.  The root namespace server can refer
     clients to other namespace servers.  Schemes should be designed to
     enforce that the relationship between namespace servers is
     hierarchical and not cyclical.  This scheme can be extended to
     support a world-wide NFS namespace as well.
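
     The two ideas in the preceding paragraph can be sketched as
     follows: deriving the root namespace server name from a DNS domain
     under the nfsroot convention, and checking that referrals between
     namespace servers contain no cycle.  The function names and the
     dictionary representation of referrals are assumptions made for
     illustration only.

```python
def root_server_name(dns_domain):
    """Apply the 'nfsroot' naming convention to a DNS domain."""
    return "nfsroot." + dns_domain.strip(".")


def has_cycle(referrals):
    """Detect a cycle in a {server: [servers it refers to]} graph."""
    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True  # reached a server already on the current path
        visiting.add(node)
        found = any(visit(nxt) for nxt in referrals.get(node, []))
        visiting.discard(node)
        done.add(node)
        return found

    return any(visit(node) for node in referrals)


root = root_server_name("example.com")
tree = {root: ["ns-eng", "ns-sales"], "ns-eng": [], "ns-sales": []}
looped = {"ns-a": ["ns-b"], "ns-b": ["ns-a"]}
tree_is_cyclic = has_cycle(tree)
loop_is_cyclic = has_cycle(looped)
```

     An administrative tool could run such a check before accepting a
     change to the referral configuration.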

     Next, with NFSv4.x clients accessing the namespace through the
     namespace server via the NFS protocol, it is then possible to
     enhance the protocol in the form of minor versions to support
     better transparency, finer granularity and better manageability.
     Possible enhancements in 4.x that may be worth some discussion
     include file-level referrals, lifetime on file handles, and
     additional client-server exchange of variable values.


Acknowledgements

     The authors would like to thank Andy Adamson, Ted Anderson and
     Robert Thurlow for their helpful comments on the draft.


Normative References

     [RFC3530]
          S. Shepler, et al., "NFS Version 4 Protocol", Standards Track
          RFC 3530, April 2003.

Informative References

     [Noveck]
          D. Noveck, C. Burnett, "Implementation Guide for Referrals in
          NFSv4", IETF Internet Draft,
          draft-noveck-nfsv4-referrals-00.txt

     [Thurlow]
          R. Thurlow, "A Namespace For NFS Version 4", IETF Internet
          Draft, draft-thurlow-nfsv4-namespace-00.txt


Author's Address

     C. Charles Fan
     Rainfinity
     2740 Zanker Road
     San Jose, CA 95134  USA

     Phone: +1 408 382 4755
     EMail: fan@rainfinity.com


Full Copyright Statement

     Copyright (C) The Internet Society (2004).  This document is
     subject to the rights, licenses and restrictions contained in BCP
     78 and except as set forth therein, the authors retain all their
     rights.

     This document and the information contained herein are provided on
     an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
     REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND
     THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES,
     EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT
     THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR
     ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A
     PARTICULAR PURPOSE.


Intellectual Property

     The IETF takes no position regarding the validity or scope of any
     Intellectual Property Rights or other rights that might be claimed
     to pertain to the implementation or use of the technology described
     in this document or the extent to which any license under such
     rights might or might not be available; nor does it represent that
     it has made any independent effort to identify any such rights.
     Information on the procedures with respect to rights in RFC
     documents can be found in BCP 78 and BCP 79.

     Copies of IPR disclosures made to the IETF Secretariat and any
     assurances of licenses to be made available, or the result of an
     attempt made to obtain a general license or permission for the use
     of such proprietary rights by implementers or users of this
     specification can be obtained from the IETF on-line IPR repository
     at http://www.ietf.org/ipr.

     The IETF invites any interested party to bring to its attention any
     copyrights, patents or patent applications, or other proprietary
     rights that may cover technology that may be required to implement
     this standard.  Please address the information to the IETF at
     ietf-ipr@ietf.org.


Acknowledgement

     Funding for the RFC Editor function is currently provided by the
     Internet Society.