Sedna LogoBackground Top
 
Home  |  Getting Started  |  Documentation  |  Demo  |  Download  |  Support 

2.2 XQuery Options and Extensions

2.2.1 Controlling Serialization

Serialization is the process of converting XML nodes evaluated by XQuery into a stream of characters. In Sedna serialization is carried out when the result of a query is returned to the user. You can control the serialization by setting the serialization parameters specified in [5]. Currently, Sedna supports the following serialization parameters:




Parameter nameValues Description






indent ”yes” or ”no” (default yes)Output indented



To set a serialization parameter, use the output option in a query prolog. The output option is in the Sedna namespace (http://www.modis.ispras.ru/sedna) which is the predefined namespace in Sedna so you can omit its declaration. The value of the output option must have the following structure ”parameter-name=value; parameter-name=value”. Consider the following examples:

declare namespace se = "http://www.modis.ispras.ru/sedna";  
declare option se:output "indent=yes";

As mentioned above, you may omit the Sedna namespace declaration:

declare option se:output "indent=yes";

2.2.2 Value Index-scan Functions

In the current version of Sedna, query executor does not use indices automatically. Use the following functions to enforce executor to employ indices.

index-scan ($title as xs:string, $value as xdt:anyAtomicType,  
           $mode as xs:string ) as node()*

The index-scan function scans the index with the $title name and returns the sequence of nodes which keys are equal (less than, greater than, greater or equal, less or equal) to the search value $value. An Sedna error is raised if the search value can not be cast to the atomic type of the index. The $mode parameter of the xs:string type is used to set the type of the scan. The value of the parameter must be equal to one of the following: ’EQ’ (equal), ’LT’(less than), ’GT’ (greater than), ’GE’ (greater or equal), ’LE’ (less or equal).

index-scan-between ($title as xs:string, $value1 as xdt:anyAtomicType,  
         $value2 as xdt:anyAtomicType, $range as xs:string )  
                               as node()*

The index-scan-between scans the index with the $title name and returns the sequence of nodes which keys belong to the interval (segment, left half-interval, right half-interval) between the first $value1 and second $value2 search values. An Sedna error is raised if the search values can not be cast to the atomic type of the index. This function provides the $range parameter of the xs:string type to set the type of the scan. The value of the string must be equal to one of the following: ’INT’ (interval), ’SEG’ (segment), ’HINTR’ (right half-interval), ’HINTL’ (left half-interval).

For example, to select the names of people who live in the London city employing the ”people” index defined in section 2.5.3, use the following expression:

index-scan("people", "London", "EQ")/name

2.2.3 Full-text Search Functions

Please read section 2.5.4 before reading this section.

In the current version of Sedna, query executor does not use full-text indecies automatically. Use the following functions to enforce executor to employ indices.

ftindex-scan($title as xs:string, $query as xs:string,  
             $options as xs:string) as node()*

The ftindex-scan function scans the full-text index with the $title name and returns the sequence of items which satisfy the $query. As full-text index is based on dtSearch [15], use the dtSearch request language [16] to specify the query. DtSearch options dtsSearchAnyWords or dtsSearchAllWords may be specified in $options.

For example, you can employ the ”articles” index defined in section 2.5.4 to select the titles of articles that contain word ”apple” but not ”pear”:

ftindex-scan("articles", "apple and not pear")/title

ftscan($seq as node()*, $query as xs:string, $type as xs:string,  
$customization_rules as xs:string) as node()*

The ftscan function returns those items of the input sequence $seq which satisfy the query $query. The function does not use indices and can be applied to any sequence of nodes, even those that are not indexed. The query $query is evaluated over the text representation constructed according to the $type and $customization_rules paramters. The values of the parameters are the same as those used when a full-text index is created (see section 2.5.4 for details).

For example, you can select the titles of articles that contain word ”apple” but not ”pear” without using indices and using special customization rules as follows:

ftscan(doc("foo")/library//article, "apple and not pear",  
"customized-value", (("b","string-value"),("a","delimited-value")))/title

2.2.4 SQL Connection

SQL Connection allows access to relational databases from XQuery using SQL. The resulting ralations are represented on-the-fly as sequenses of XML-elements representing rows. These elements have sub-elements corresponding with the columns returned by the SQL query. and thus can be easy processed in XQuery. All functions dealing with access to SQL data are located in the namespace http://modis.ispras.ru/Sedna/SQL” which is refered as sql in the following function declarations and examples.

Connections

In order to execute SQL queries on a RDBMS, you should first establish a connection to it. There are two functions for managing connections:

function sql:connect($db-url as xs:string) as xs:integer  
function sql:connect($db-url as xs:string, $user as xs:string)  
       as xs:integer  
function sql:connect($db-url as xs:string, $user as xs:string,  
      $password as xs:string) as xs:integer  
function sql:connect($db-url as xs:string, $user as xs:string,  
      $password as xs:string, $options as element()*) as xs:integer

These functions attempt to establish a database connection to the given URL using a user name and password if specified. They return a connection handle which could be then passed to sql:execute, sql:prepare, sql:close, sql:rollback, and sql:commit

If no connection can be opened, an Sedna error is raised.

All arguments of the sql:connect functions except for $db-url are optional:

  • $db-url is the URL of the database to which a connect is established. URL is of the following form:
    odbc:<driver name>:[//<server>[/<database>][;]][<driver options>]

    “;” after <database> or <server> is required if there are some driver options following it. Driver options must be in the following form: <option>=<value>{;<option>=<value>}. List of available options depens on the ODBC driver used. One of the common options is “Port” which is used to specify the port on which the database server is configured to listen.

    Example: odbc:MySQL ODBC 3.51 Driver://localhost/somedb;Port=1234

  • $user is your user name for the session.
  • $password is your password for the session.
  • $options is an optional sequence of connection options. Connection options are elements of the form
    <sql:option name="<option-name>" value="<option-value>"/>

    The only connection option available for the moment is “manual-commit” which enables manual commit mode if its value is “on

To disconnect from the database, you can use the following function:

function sql:close($connecton as xs:integer) as element()?

It closes database connection associated with connection handle $connection. An Sedna error is raised if operation cannot be completed.

Executing Queries

When a database connection is established you can start executing queries. Two types of query execution are supported: direct query execution and prepared query execution.

Direct Queries

Simple SQL queries are executed as the following XQuery example shows:

declare namespace sql="http://modis.ispras.ru/Sedna/SQL";  
let $connection :=  
   sql:connect("odbc:MySQL ODBC 3.51 Driver://localhost/somedb", "user",  
      "pass")  
return  
   sql:execute($connection, "SELECT * FROM people WHERE first = ’Peter’");

The result is as follows:

<tuple first="Peter" last="Jackson" city="Wellington"/>

There are two functions for direct query execution:

function sql:execute($connection as xs:integer, $statement as xs:string)  
   as element()*  
function sql:execute($connection as xs:integer, $statement as xs:string,  
      $query-options as element()*) as element()*

These functions execute a SQL query and return a sequence of elements representing the query result. SQL query can be as both a query statement and an update statement. In case of query statement, the result sequence contains an element named ’row’ for each row of the query result. Each element contains as many children attributes as there are non-NULL fields in the corresponding result-row. Each attribute has the name of a row field. Fields with NULL values are not included. In case of update statement, empty sequence is returned.

An Sedna error is raised on an erroneous statement.

The sql:execute have the following arguments:

  • $connection is a connection handle, returned by sql:connect;
  • $statement is a string containing SQL statement to be executed;
  • $query-options is a sequence of optional query parameters.

Update queries can be executed using the sql:exec-update function:

function sql:exec-update($connection as xs:integer,  
          $statement as xs:string) as xs:integer  
function sql:exec-update($connection as xs:integer,  
      $statement as xs:string,  
      $query-options as element()*) as xs:integer

these functions are similar to sql:execute, but return the number of rows affected by an update query (instead of an empty sequence returned by sql:execute for update-queries). Function arguments are same as for sql:execute.

The behaviour of this function is undefined for non-update queries.

Prepared Statements

Sometimes it is more convenient or more efficient to use prepared SQL statements instead of direct query execution. In most cases, when a SQL statement is prepared it will be sent to the DBMS right away, where it will be compiled. This means that the DBMS does not have to compile a prepared statement each time it is executed. Prepared statements can take parameters. This allows using the same statement and supply it with different values each time you execute it, as in the following XQuery example:

declare namespace sql="http://modis.ispras.ru/Sedna/SQL";  
let $connection :=  
   sql:connect("odbc:MySQL ODBC 3.51 Driver://localhost/somedb", "user",  
     "pass")  
let $statement :=  
   sql:prepare($connection, "INSERT INTO people(first, last)  
         VALUES (?, ?)")  
return (  
   sql:execute($statement, "John", "Smith"),  
   sql:execute($statement, "Matthew", "Barney")  
)

this XQuery code inserts two rows into table people and returns an empty sequence.

To use prepared statements, first you need to create a prepared statement handle using the sql:prepare function:

function sql:prepare($connection as xs:integer, $statement as xs:string)  
      as xs:integer  
function sql:prepare($connection as xs:integer, $statement as xs:string,  
      $query-options as element()*) as xs:integer

these function prepare a SQL statement for later execution and returns a prepated statement handle which can be used in the sql:execute and sql:exec-update functions.

The sql:prepare functions have the following arguments:

  • $connection is a connection handle, created by sql:connect;
  • $statement is a string containing a SQL statement that may contain one or more ’?’ - IN parameter placeholders;
  • $query-options is a sequence of optional query parameters.

An Sedna error is raised on an erroneous statement.

There are two prepared statement execution functions, similar to direct query execution:

function sql:execute($prepared-statement as xs:integer,  
      $param1 as item()?, ...) as element()*

this function is similar to sql:execute for direct queries and returns a sequence of elements representing query result.

  • $prepared-statement is a prepared statement handle created by $sql:prepare;
  • $param1, ... are parameters for parametrized statements. The number of parameters specified must exactly match the number of parameters of the prepared statement. NULL values are represended as empty sequences “()”.
function sql:exec-update($prepared-statement as xs:integer,  
      $param1 as item()?, $param2...) as xs:integer

This function is similar to sql:execute, but returns the number of rows affected by an update query (instead of an empty sequence returned by sql:execute for update-qeries). Function arguments are the same as for sql:execute.

The behaivor of this function is undefined for non-update queries.

Transactions

The default commit mode of connection is auto-commit, meaning that all updates will be commited automatically. If this is not desired behaviour, you can pass manual-commit option to sql:connect when you create a connection handle.

In manual commit mode you can specify when updates will be committed or rolled back:

declare namespace sql="http://modis.ispras.ru/Sedna/SQL";  
let $connection :=  
  sql:connect("odbc:MySQL ODBC 3.51 Driver://localhost/somedb", "user",  
         "pass",<sql:option name="manual-commit" value="1"/>)  
return  
  for $person in doc("somedoc.xml")/person  
  return (  
    sql:execute($connection, "<do something with person>"),  
    if (allok($connection, $person)) then  
      (  
        sql:execute($connection, "<do something else with person>"),  
        sql:commit($connection)  
      )  
    else  
      sql:rollback($connection)  
  )

There are two functions for specifying transaction boundaries - sql:commit and sql:rollback (transactions are started automatically by queries, these functions only close them):

function sql:commit($connecton as xs:integer) as element()?

sql:commit commits all changes made during the last transaction in the database connection specified by connection handle $connection and closes transaction.

An Sedna error is raised if operation cannot be completed.

function sql:rollback($connecton as xs:integer) as element()?

rolls back all changes made during the last transaction in the database connection specified by connection the handle $connection and closes transaction.

An Sedna error is raised if operation cannot be completed.

2.2.5 External Functions

External function is a notion defined in the XQuery specification [3] as follows: ”External functions are functions that are implemented outside the query environment”. Support for external functions allows you to extend XQuery by implementing functions in other languages.

Sedna provides a server programming API to write external functions in the C/C++ language. External functions in Sedna are limited to dealing with sequences of atomic values. External functions are compiled and linked in the form of shared libraries (i.e. .dll files in Windows or .so files in Linux/FreeBSD/Mac OS) and loaded by the server on demand. Although the Sedna XQuery executor evaluates queries in a lazy manner, all external function calls are evaluated in an eager manner.

Using External Functions

To use an external function you need to declare this function in prologue with external keyword instead of function body. Then it may be used normally:

declare function se:f($a as xs:integer) as $xs:integer external;  
f(10)

Creating External Functions

External functions must be written in C/C++. To implement a new XQuery function func you should write the following C (or C++) functions: func, func_init and func_deinit. When executor decides that it needs to use an external function, first it initializes this function by calling func_init, after that it will call func to compute results as many times as needed. When some external function is not needed anymore, executor calls func_deinit (which probaly will free any memory allocated by func_init). Each one of the three functions receives an SEDNA_EF_INIT1 structure as a parameter. This structure has several fields that are initialized by executor before any func_init or func_deinit calls:

typedef struct sedna_ef_init  
{  
    void *(*sedna_malloc)(size_t);  
    void (*sedna_free)(void *);  
    SEDNA_SEQUENCE_ITEM *item_buf;  
} SEDNA_EF_INIT;

The fields of this structure may be used in your implementation:

  • sedna_malloc is a pointer to a malloc function which must be used to allocate memory for function results, this memory will be automatically freed by the query executor. It may also be used to allocate memory for internal use, such memory must be freed manually using the sedna_free function.
  • sedna_free is a pointer to free function that releases memory allocated using sedna_malloc function.
  • item_buf is a pointer to a preallocated SEDNA_SEQUENCE_ITEM which may be used to store results (this allows to avoid using sedna_malloc function when result is a single atomic non-string value)

func, func_init and func_deinit must have specific signatures:

  • func() (required)

    computes external function results. This function has the following signature:

    SEDNA_SEQUENCE_ITEM *func(SEDNA_EF_INIT *init,  
                              SEDNA_EF_ARGS *args,  
                              char * error_msg_buf);

    • init is a pointer to the SEDNA_EF_INIT structure which was passed to func_init function (if written)
    • args is a pointer to the SEDNA_EF_ARGS structure which contains all function arguments
    • error_msg_buf is a pointer to the string buffer used for specifying error message if function invocation fails. Maximum message length is SEDNA_ERROR_MSG_BUF_SIZE bytes, including the null character ’\0’.
  • func_init() (optional)

    The initialization function. It can be used to allocate any memory required by the main function. This function has the following signature:

    void func_init(SEDNA_EF_INIT *init, char * error_msg_buf);

    • init is a pointer to the SEDNA_EF_INIT structure (the pointer to this structure will be passed then to func and func_deinit functions)
    • error_msg_buf is a pointer to the string buffer used for specifying error message if function invocation fails. Maximum message length is SEDNA_ERROR_MSG_BUF_SIZE, including the null character ’\0’.
  • func_deinit() (optional)

    The deinitialization function. It should deallocate any memory allocated by the initialization function. This function has the following signature:

    void func_init(SEDNA_EF_INIT *init, char * error_msg_buf);

    • init is a pointer to the SEDNA_EF_INIT structure which was passed to func_init function (if written)
    • error_msg_buf is a pointer to the string buffer used for specifying error message if function invocation fails. Maximum message length is SEDNA_ERROR_MSG_BUF_SIZE, including the null character ’\0’.

When func, func_init or func_deinit is executed error_msg_buf contains an empty string. If function is executed succesfully, it should not change this value. In case of error a non-empty string (error description) must be placed in error_msg_buf (if you place an empty string in error_msg_buf executor assumes that function execution was successful)

Each shared library must also export an null-terminated array with the names of the XQuery functions defined by this library:

char const *ef_names[] = { "func", NULL};

The file sedna_ef.h defines several types for representing function arguments and results:

  • SEDNA_ATOMIC_TYPE

    represents an atomic type:

    typedef enum sedna_atomic_type {  
         SEDNATYPE_integer,  
         SEDNATYPE_float,  
         SEDNATYPE_double,  
         SEDNATYPE_string  
    } SEDNA_ATOMIC_TYPE;

  • SEDNA_ATOMIC_VALUE

    represents an atomic value:

    typedef int     SEDNA_integer;  
    typedef float   SEDNA_float;  
    typedef double  SEDNA_double;  
    typedef char   *SEDNA_string;  
    typedef struct sedna_atomic_value  
    {  
        SEDNA_ATOMIC_TYPE type;  
        union {  
            SEDNA_integer val_integer;  
            SEDNA_float   val_float;  
            SEDNA_double  val_double;  
            SEDNA_string  val_string;  
        };  
    } SEDNA_ATOMIC_VALUE;

    Memory for values that are pointers (i.e. SEDNA_string) MUST be allocated using the malloc function passed in the SEDNA_EF_INIT structure.

  • SEDNA_SEQUENCE_ITEM represents a node in a linked list of atomic values:
    typedef struct sedna_sequence_item  
    {  
        SEDNA_ATOMIC_VALUE data;  
        struct sedna_sequence_item *next;  
    } SEDNA_SEQUENCE_ITEM;

    Linked lists are used to represent sequences of atomic values. An empty sequence is presented by a NULL pointer. If func needs to return a sequence of values, memory for nodes MUST be allocated using the malloc function passed in SEDNA_EF_INIT structure.

  • SEDNA_EF_ARGS represents an array of arguments passed to a function:
    typedef struct sedna_ef_args  
    {  
        int length;  
        SEDNA_SEQUENCE_ITEM **args;  
    } SEDNA_EF_ARGS;

Location of External Function Libraries

Compiled libraries must be placed in the directory lib that is (1) in the same directory where the directory data with database data is located or (2) in the directory <db_name>_files where database data are stored2 . Libraries that are database-independent should be placed in (1). Libraries that are database-specific should be placed in (2). Overloaded functions are not allowed. If two libraries located in (1) and (2) contain functions with the same name, a function from the library in (2) is called. If libraries in the same directory (1 or 2) contain functions with the same name, it is not specified which one is called.

There is a sample external function code available in the folder:

[win:] INSTALL_DIR\examples\api\external-functions\c\

[nix:] INSTALL_DIR/examples/api/external-functions/c/

where INSTALL_DIR refers to the directory where Sedna is installed.