What’s this thing called CMIS? Part 3: Folders, Path, Versions
The third part of my series is about folders, paths and versions.
Older articles in this series are:
Part 1: Overview, domain model and bindings
Part 2: Domain model part 1, Repository, Types, and Properties
Folders are a well-known concept. They are present even in the simplest file systems and are used all over the place. Access by path is also a well-known pattern from file systems and from the World Wide Web. Versions are typically not available in a file system but are available in most document management systems. They are essential for any kind of collaborative editing and they can be used to preserve the history of a document.
Almost any attempt to structure information ends up defining some kind of hierarchy. Folders are a very simple form of hierarchy, which is what makes them so useful. In CMIS every repository can have a folder hierarchy. Each folder has exactly one parent folder, with one exception: the root folder, which has no parent. The navigation service in CMIS is used to navigate along the folder hierarchy. It has methods to get the parent(s) of an object and to enumerate the children of a folder. Each folder has a unique id. The id of the root folder is part of the repository info. The details of how repositories implement folders vary, so the CMIS specification makes some features optional. One of these options allows a document to be contained in more than one folder. This feature is called multi-filing and is only available for documents. A multi-filed document therefore has multiple parents. Folders, however, always have exactly one parent (except the root folder), so each folder appears only once in the hierarchy.
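To make the parent rules concrete, here is a minimal in-memory sketch in Python. It is not a CMIS client and does not use any CMIS library; the class names, the ids and the multi_filing flag are made up for illustration. It only shows that a folder carries a single parent while a document carries a list of parents.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Folder:
    object_id: str
    name: str
    parent: Optional["Folder"] = None  # exactly one parent; None only for the root
    children: List[object] = field(default_factory=list)


@dataclass(eq=False)
class Document:
    object_id: str
    name: str
    parents: List[Folder] = field(default_factory=list)  # several with multi-filing


def create_folder(object_id: str, name: str, parent: Folder) -> Folder:
    """Create a folder and hook it into its single parent."""
    folder = Folder(object_id, name, parent)
    parent.children.append(folder)
    return folder


def file_document(doc: Document, folder: Folder, multi_filing: bool = True) -> None:
    """Add doc to folder; refuse a second parent if multi-filing is off."""
    if doc.parents and not multi_filing:
        raise ValueError("repository does not support multi-filing")
    doc.parents.append(folder)
    folder.children.append(doc)


# Hypothetical ids and names, for illustration only.
root = Folder("id-root", "")
projects = create_folder("id-1", "projects", root)
archive = create_folder("id-2", "archive", root)

report = Document("id-3", "report.pdf")
file_document(report, projects)
file_document(report, archive)  # second parent: the document is now multi-filed
```

After these calls report.parents holds both folders, while projects.parent and archive.parent each point to the single root folder.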
Another option is that a repository supports storing documents outside of the folder hierarchy, so that a document has no parent folder at all. This is called un-filing. Un-filing is often used in archiving scenarios where the main purpose of the repository is to preserve documents for a long time in a stable way. The repository provides an id, and this id is stored in a leading application that controls the document and its context.
Please note that CMIS objects that are neither documents nor folders (like relationships or policies) are never contained in folders. This is part of the specification and independent of the un-filing capability.
The specification makes no assumptions about further constraints. So it is valid for a repository to allow a folder to contain two or more objects with the same name. CMIS clients should be prepared for such repository behavior.
A repository can support the constraint that a folder may contain only objects of certain types. A special folder property called AllowedChildObjectTypeIDs (cmis:allowedChildObjectTypeIds in the final specification) is used to control this behavior.
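The check behind this constraint can be sketched in a few lines of Python. This is an illustration, not repository code: the function name is made up, "invoice" is a hypothetical type id, and I assume that an unset or empty list means "no restriction". The sketch also ignores subtypes of the listed types, whose handling may differ between repositories.

```python
from typing import Optional, Sequence


def may_contain(allowed_child_type_ids: Optional[Sequence[str]],
                child_type_id: str) -> bool:
    """Return True if a folder with this constraint accepts the child type.

    Assumption: an unset (None) or empty constraint means "no restriction".
    """
    if not allowed_child_type_ids:
        return True
    return child_type_id in allowed_child_type_ids


invoices_only = ["invoice"]  # hypothetical type id
print(may_contain(invoices_only, "invoice"))        # True
print(may_contain(invoices_only, "cmis:document"))  # False
print(may_contain(None, "cmis:folder"))             # True
```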
In many cases objects are identified by their path. The path points to a location in the folder hierarchy, with the special character ‘/’ separating the folders. A path is another mechanism to access a document in addition to the object id. Paths are familiar from the World Wide Web (URLs) and from file systems. CMIS supports retrieving objects by their path (getObjectByPath in the ObjectService) or by their id (getObject). Be aware that a path to a document does not have to be unique. With multi-filing a document can be in multiple folders and so have more than one path. For this reason there is also no method like getPath() that returns the path to an object. Instead you have to use the getObjectParents() method, which returns relativePathSegment strings. A relative path segment is the part of the path to be used to address the object in this parent folder.
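A small Python sketch shows why a single getPath() cannot exist. It walks up a parent table and collects every path from the root to an object. The table stands in for what a client would learn from repeated getObjectParents() calls; the data layout, the function name and all ids are assumptions made for this example.

```python
from typing import Dict, List, Tuple

# object id -> list of (parent folder id, relative path segment in that parent)
Parents = Dict[str, List[Tuple[str, str]]]


def all_paths(object_id: str, parents: Parents, root_id: str) -> List[str]:
    """Return every path that leads from the root folder to the object."""
    if object_id == root_id:
        return ["/"]
    paths = []
    for parent_id, segment in parents.get(object_id, []):
        # Each path of the parent yields one path of the object.
        for parent_path in all_paths(parent_id, parents, root_id):
            paths.append(parent_path.rstrip("/") + "/" + segment)
    return paths


# Hypothetical repository content: one multi-filed and one unfiled document.
parents: Parents = {
    "f-projects": [("f-root", "projects")],
    "f-archive": [("f-root", "archive")],
    "d-report": [("f-projects", "report.pdf"), ("f-archive", "report.pdf")],
    "d-unfiled": [],
}

print(all_paths("d-report", parents, "f-root"))
# ['/projects/report.pdf', '/archive/report.pdf']
print(all_paths("d-unfiled", parents, "f-root"))
# []
```

The multi-filed document ends up with two paths and the unfiled one with none, so a client that needs a path has to pick one of the parents itself.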
A CMIS client should not assume that the name of an object is part of its path. If a repository allows multiple objects with the same name in a folder, a path built from names would not be unique. Therefore use the pathSegment output of the getChildren() method in the Navigation Service instead. This is guaranteed to be unique. Because calculating it can be expensive, you explicitly have to ask for it (by setting the includePathSegment parameter to true).
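On the client side this boils down to building child paths from the path segment and never from the name. The following Python sketch uses plain dictionaries in place of a real getChildren() result; the helper name and the segment values (including the "~1" suffix) are hypothetical and only illustrate what a repository might return.

```python
from typing import Dict, List


def child_paths(folder_path: str, children: List[Dict[str, str]]) -> List[str]:
    """Build one path per child from its path segment, not from its name."""
    base = folder_path.rstrip("/")
    return [base + "/" + child["pathSegment"] for child in children]


# Two children share a name; the path segments are hypothetical repository output.
children = [
    {"name": "notes.txt", "pathSegment": "notes.txt"},
    {"name": "notes.txt", "pathSegment": "notes.txt~1"},
]
paths = child_paths("/projects", children)
print(paths)  # ['/projects/notes.txt', '/projects/notes.txt~1']
```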
Paths are not stable. If an object is moved from one folder to another, its path changes. For this reason it is in general not a good idea to use path access in archiving scenarios. The id is the preferred choice in this case (however, the behavior of repositories may vary regarding the stability of ids, for example after updating an object).
A document in CMIS can exist in multiple versions. Only documents can be versioned; other objects like folders, relationships, etc. can’t be versioned. Not every document can be versioned. Whether versioning is supported or not is determined by the Document Type (versionable property). CMIS supports a simple linear versioning model. Versions can be major or minor. A version series comprises all versions that belong to one document. A version series has an id. The VersioningService is used to create and access versions.
To create a new version in a version series the document needs to be checked out. After checking out you get a private working copy (PWC). A private working copy also has an id. Only one PWC can exist at any point in time for a version series. A PWC can be updated and edited by the user who owns the PWC. To create a new version the PWC has to be checked in. The check-in operation indicates whether a major or minor version is created. The cancelCheckOut method discards the private working copy (and any changes made to it are lost). Whether a PWC is visible to others and, for example, can be returned by a query is repository specific. It is also repository specific whether versions other than the PWC can be edited or updated. The RepositoryInfo structure gives you certain information about this.
The versionSeriesId is a property of a versioned document. Its presence indicates that multiple versions of this document exist. The CMIS specification makes no assumptions about what other operations are allowed with a versionSeriesId as input parameter. This is repository specific. The specification also does not define the behavior of properties for versions. Each version has a set of associated properties; some of them are specific to the version (like the version number). A client should be prepared for repositories that internally map certain properties to the version series and not to the individual version. So if such a property is changed for one version, it may change for all other versions as well. In a query a client can specify whether it wants to retrieve all versions of a document. A repository can support (see RepositoryInfo) storing versions in different folders (version-specific filing). A relationship can be tied to a version or to a version series (repository specific).
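The check-out and check-in cycle can be sketched as a small state machine in Python. This is a toy model, not the VersioningService: the class and method names, the "major.minor" label format and the rule for the first label are assumptions for illustration, and real repositories are free to number versions differently.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Version:
    label: str    # "major.minor"; the label format is an assumption
    major: bool
    content: str


@dataclass
class VersionSeries:
    series_id: str
    versions: List[Version] = field(default_factory=list)
    pwc: Optional[str] = None  # content of the private working copy, if any

    def check_out(self) -> None:
        """Create the PWC; only one may exist per version series."""
        if self.pwc is not None:
            raise RuntimeError("version series is already checked out")
        self.pwc = self.versions[-1].content if self.versions else ""

    def cancel_check_out(self) -> None:
        """Discard the PWC and every change made to it."""
        self.pwc = None

    def check_in(self, major: bool) -> Version:
        """Turn the PWC into a new major or minor version."""
        if self.pwc is None:
            raise RuntimeError("no private working copy to check in")
        version = Version(self._next_label(major), major, self.pwc)
        self.versions.append(version)
        self.pwc = None
        return version

    def _next_label(self, major: bool) -> str:
        if not self.versions:
            return "1.0" if major else "0.1"
        big, small = (int(part) for part in self.versions[-1].label.split("."))
        return f"{big + 1}.0" if major else f"{big}.{small + 1}"


series = VersionSeries("vs-1")  # hypothetical version series id
series.check_out()
series.pwc = "first draft"
series.check_in(major=False)
series.check_out()
series.pwc = "reviewed"
series.check_in(major=True)
series.check_out()
series.pwc = "abandoned edit"
series.cancel_check_out()
print([v.label for v in series.versions])  # ['0.1', '1.0']
```

The third check-out is cancelled, so the series still ends with the "reviewed" content as its latest version and no PWC is left behind.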