Database presenting data in XML formats
An XML database is a data persistence software system that allows data to be specified and stored in XML format. This data can be queried, transformed, exported, and returned to a calling system. XML databases are a flavor of document-oriented databases, which are in turn a category of NoSQL database.
Reasons to store data in XML format as an XML database include:[1] [2]
Steve O'Connell gives one reason for the use of XML in databases: the increasingly common use of XML for data transport, which has meant that "data is extracted from databases and put into XML documents and vice-versa".[4] It may prove more efficient in terms of conversion costs, and easier, to store the data in XML format. In content-based applications, a native XML database also minimizes the need to extract or enter metadata to support searching and navigation.
XML-enabled databases typically offer one or more of the following approaches to storing XML within the traditional relational structure:
RDBMS that support the ISO XML Type are:
Typically an XML-enabled database is best suited where the majority of data are non-XML. For datasets where the majority of data are XML, a native XML database is better suited.
select
    id, vol, xmlquery('$j/name', passing journal as "j") as name
from
    journals
where
    xmlexists('$j[licence="CreativeCommons"]', passing journal as "j")
XML databases are often used in combination with relational databases to manage and store hierarchical data. A significant challenge in such integrations is extracting XML documents from relational databases, which requires specialized techniques and tools. These techniques often include:
One of the most common scenarios involves converting relational data into XML documents[11] to facilitate interoperability with systems relying on XML-based standards, such as web services or APIs. This process is important in applications where structured and semi-structured data coexist and must be integrated seamlessly.
For example, extracting hierarchical data from relational databases and converting it into XML is a common approach when generating XML feeds, exchanging data between systems, or implementing XML-based configurations.
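As a minimal sketch of that conversion, the following Python example reads rows from two related tables and nests them into an XML document. It uses only the standard library (`sqlite3` and `xml.etree.ElementTree`); the `journals` and `articles` tables, their columns, the sample rows, and the function names are hypothetical and chosen only for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET


def journals_to_xml(conn: sqlite3.Connection) -> ET.Element:
    """Build a <journals> tree from the hypothetical journals/articles tables."""
    root = ET.Element("journals")
    journals = conn.execute("SELECT id, name FROM journals ORDER BY id").fetchall()
    for journal_id, name in journals:
        journal = ET.SubElement(root, "journal", id=str(journal_id))
        ET.SubElement(journal, "name").text = name
        # The one-to-many relationship becomes parent/child nesting in XML.
        articles = conn.execute(
            "SELECT title FROM articles WHERE journal_id = ? ORDER BY id",
            (journal_id,),
        )
        for (title,) in articles:
            ET.SubElement(journal, "article").text = title
    return root


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE journals (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE articles (
            id INTEGER PRIMARY KEY,
            journal_id INTEGER REFERENCES journals(id),
            title TEXT
        );
        INSERT INTO journals VALUES (1, 'Journal A'), (2, 'Journal B');
        INSERT INTO articles VALUES
            (1, 1, 'First article'),
            (2, 1, 'Second article'),
            (3, 2, 'Third article');
        """
    )
    return conn


if __name__ == "__main__":
    document = journals_to_xml(build_demo_db())
    print(ET.tostring(document, encoding="unicode"))
```

Relational systems with SQL/XML support can often produce such documents inside the database itself; the client-side loop here only makes the row-to-element mapping explicit.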
Native XML databases are especially tailored for working with XML data. Because managing XML as large strings would be inefficient, and because XML is hierarchical, they use custom, optimized data structures for storage and querying. This usually improves performance for both read-only queries and updates.[12] XML nodes and documents are the fundamental units of (logical) storage, just as a relational database has fields and rows.
The standard for querying XML data per W3C recommendation is XQuery; the latest version is XQuery 3.1.[13] XQuery includes XPath as a sub-language and XML itself is a valid sub-syntax of XQuery. In addition to XPath, some XML databases support XSLT as a method of transforming documents or query results retrieved from the database.
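To illustrate the kind of path expression that XPath (and therefore XQuery) provides, here is a small sketch in Python. It relies on the limited XPath subset implemented by the standard library's `xml.etree.ElementTree`, not on a full XQuery engine, and the sample document and the helper name `names_with_licence` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up sample data, for illustration only.
SAMPLE = """
<journals>
  <journal><name>Journal A</name><licence>CreativeCommons</licence></journal>
  <journal><name>Journal B</name><licence>Proprietary</licence></journal>
  <journal><name>Journal C</name><licence>CreativeCommons</licence></journal>
</journals>
"""


def names_with_licence(xml_text: str, licence: str) -> list[str]:
    """Return the <name> text of every journal whose <licence> equals `licence`."""
    root = ET.fromstring(xml_text.strip())
    # The predicate [licence='...'] keeps only <journal> elements that have a
    # <licence> child with exactly that text; /name then selects their names.
    # Simplification: `licence` is assumed not to contain a single quote.
    path = f"journal[licence='{licence}']/name"
    return [element.text for element in root.findall(path)]


if __name__ == "__main__":
    print(names_with_licence(SAMPLE, "CreativeCommons"))
```

A native XML database would evaluate a comparable expression against its stored, indexed node structures rather than re-parsing the text on each query.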
Name | License | Native Language | XQuery 3.1 | XQuery 3.0 | XQuery 1.0 | XQuery Update | XQuery Full Text | EXPath Extensions | EXQuery Extensions | XSLT 2.0 | XForms 1.1 | XProc 1.0 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
BaseX | BSD | Java | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No |
eXist | GNU LGPL | Java | Partial | Partial | Yes | Proprietary | Proprietary | Yes | Yes | Yes | Yes | Yes |
MarkLogic Server | Commercial | C++ | No | Partial | Yes | Proprietary | Proprietary | No | No | Yes | Yes | No |
OpenText xDB | Commercial | Java | Partial | Partial | Yes | Yes | Yes | No | No | No | No | No |
Oracle Berkeley DB XML | Commercial | C/C++ | | | | | | | | | | |
Qizx | Commercial | Java | No | No | Yes | Yes | Yes | No | No | Yes | No | No |
Sedna | Apache License 2.0 | C/C++ | | | | | | | | | | |
For data-centric XML datasets, XDMA,[14] a keyword search method for XML databases, is based on dual indexing and mutual summation.