Class S9
- java.lang.Object
-
- org.postgresql.pljava.example.saxon.S9
-
- All Implemented Interfaces:
org.postgresql.pljava.ResultSetProvider
,org.postgresql.pljava.ResultSetProvider.Large
public class S9 extends Object implements org.postgresql.pljava.ResultSetProvider.Large
Class illustrating use of XQuery with Saxon as the implementation, using its native "s9api".Supplies alternative, XML Query-based (as the SQL/XML standard dictates) implementation of some of SQL/XML, where the implementation in core PostgreSQL is limited to the capabilities of XPath (and XPath 1.0, at that).
Without the syntatic sugar built into the core PostgreSQL parser, calls to a function in this class can look a bit more verbose in SQL, but reflect a straightforward rewriting from the standard syntax. For example, suppose there is a table
catalog_as_xml
with a single row whosex
column is a (respectably sized) XML document recording the stuff inpg_catalog
. It could be created like this:CREATE TABLE catalog_as_xml(x) AS SELECT schema_to_xml('pg_catalog', false, true, '');
Functions/predicates from ISO 9075-14 SQL/XML
XMLQUERY
In the syntax of the SQL/XML standard, here is a query that would return an XML element representing the declaration of a function with a specified name:
SELECT XMLQUERY('/pg_catalog/pg_proc[proname eq $FUNCNAME]' PASSING BY VALUE x, 'numeric_avg' AS FUNCNAME RETURNING CONTENT EMPTY ON EMPTY) FROM catalog_as_xml;
It binds the 'context item' of the query to
x
, and theNAME
parameter to the given value, then evaluates the query and returns XML "CONTENT" (a tree structure with a document node at the root, but not necessarily meeting all the requirements of an XML "DOCUMENT"). It can be rewritten as this call to thexq_ret_content
method:SELECT javatest.xq_ret_content('/pg_catalog/pg_proc[proname eq $FUNCNAME]', PASSING => p, nullOnEmpty => false) FROM catalog_as_xml, LATERAL (SELECT x AS ".", 'numeric_avg' AS "FUNCNAME") AS p;
In the rewritten form, the form of result wanted (
RETURNING CONTENT
) is implicit in the called function name (xq_ret_content
), and the parameters to pass to the query are moved out to a separateSELECT
that supplies their values, types, and names (with the context item now given the name ".") and is passed by its alias into the query function.Because of an unconditional uppercasing that PL/Java's JDBC driver currently applies to column names, any parameter names, such as
FUNCNAME
above, must be spelled in uppercase where used in the XQuery text, or they will not be recognized. Because the unconditional uppercasing is highly likely to be dropped in a future PL/Java release, it is wisest until then to use only parameter names that really are uppercase, both in the XQuery text where they are used and in the SQL expression that supplies them. In PostgreSQL, identifiers that are not quoted are lowercased, so they must be both uppercase and quoted, in the SQL syntax, to be truly uppercase.In the standard, parameters and results (of XML types) can be passed
BY VALUE
orBY REF
, where the latter means that the same nodes will retain their XQuery node identities over calls (note that this is a meaning unrelated to what "by value" and "by reference" usually mean in PostgreSQL's documentation). PostgreSQL's implementation of the XML type provides no way forBY REF
semantics to be implemented, so everything happening here happensBY VALUE
implicitly, and does not need to be specified.XMLEXISTS
The function
xmlexists
here implements the standard function of the same name. Because it is the same name, it has to be either schema-qualified or double-quoted in a call to avoid confusion with the reserved word. In the syntax of the SQL/XML standard, here is a query returning a boolean value indicating whether a function with the specified name is declared:SELECT XMLEXISTS('/pg_catalog/pg_proc[proname eq $FUNCNAME]' PASSING BY VALUE x, 'numeric_avg' AS FUNCNAME) FROM catalog_as_xml;
It can be rewritten as this call to the
xmlexists
method:SELECT "xmlexists"('/pg_catalog/pg_proc[proname eq $FUNCNAME]', PASSING => p) FROM catalog_as_xml, LATERAL (SELECT x AS ".", 'numeric_avg' AS "FUNCNAME") AS p;
XMLTABLE
The function
xmltable
here implements (much of) the standard function of the same name. Because it is the same name, it has to be either schema-qualified or double-quoted in a call to avoid confusion with the reserved word. A rewritten form of the first example in the PostgreSQL manual could be:SELECT xmltable.* FROM xmldata, LATERAL (SELECT data AS ".", 'not specified'::text AS "DPREMIER") AS p, "xmltable"('//ROWS/ROW', PASSING => p, COLUMNS => ARRAY[ 'data(@id)', null, 'COUNTRY_NAME', 'COUNTRY_ID', 'SIZE[@unit eq "sq_km"]', 'concat(SIZE[@unit ne "sq_km"], " ", SIZE[@unit ne "sq_km"]/@unit)', 'let $e := PREMIER_NAME return if ( empty($e) )then $DPREMIER else $e' ]) AS ( id int, ordinality int8, "COUNTRY_NAME" text, country_id text, size_sq_km float, size_other text, premier_name text );
In the first column expression, without the
data()
function, the result would be a bare attribute node (one not enclosed in an XML element). Many implementations will accept a bare attribute as a column expression result, and simply assume the attribute's value is wanted, but it appears that a strict implementation of the spec must raiseerr:XPTY0004
in such a case. This implementation is meant to be strict, so the attribute is wrapped indata()
to extract and return its value. (See "About bare attribute nodes" inassignRowValues
for more explanation.)The
DPREMIER
parameter passed from SQL to the XQuery expression is spelled in uppercase (and also, in the SQL expression supplying it, quoted), for the reasons explained above for thexq_ret_content
function.XMLCAST
An ISO standard cast expression like
XMLCAST(v AS wantedtype)
can be rewritten with this idiom and thexmlcast
function provided here:(SELECT r FROM (SELECT v) AS o, xmlcast(o) AS (r wantedtype))
XQuery regular-expression functions in ISO 9075-2 Foundations
The methodslike_regex
,occurrences_regex
,position_regex
,substring_regex
, andtranslate_regex
provide, with slightly altered syntax, the ISO SQL predicate and functions of the same names.For the moment, they will only match newlines in the way W3C XQuery specifies, not in the more-flexible Unicode-compatible way ISO SQL specifies, and for the ones where ISO SQL allows
USING CHARACTERS
orUSING OCTETS
, onlyUSING CHARACTERS
will work.Extensions
XQuery module prolog allowed
Where any function here accepts an XQuery "expression" according to the SQL specification, in fact an XQuery "main module" will be accepted. Therefore, the query can be preceded by a prolog declaring namespaces, options, local variables and functions, etc.
Saxon extension to XQuery regular expressions
Saxon's implementation of XQuery regular expressions will accept a nonstandard flag string ending with
;j
to use Java regular expressions rather than XQuery ones. That extension is available in the XQuery regular-expression methods provided here.- Author:
- Chapman Flack
-
-
Method Summary
Modifier and Type Method Description boolean
assignRowValues(ResultSet receive, long currentRow)
Produce and return one row of theXMLTABLE
result table per call.void
close()
Called when PostgreSQL has no need for more rows of the tabular result.static boolean
like_regex(String value, String pattern, String flag, boolean w3cNewlines)
Function form of the ISO SQL<regex like predicate>
.static int
occurrences_regex(String pattern, String in, String flag, int from, boolean usingOctets, boolean w3cNewlines)
Syntax-sugar-free form of the ISO SQLOCCURRENCES_REGEX
function: how many times does a pattern occur in a string?static int
position_regex(String pattern, String in, String flag, int from, boolean usingOctets, boolean after, int occurrence, int group, boolean w3cNewlines)
Syntax-sugar-free form of the ISO SQLPOSITION_REGEX
function: where does a pattern, or part of it, occur in a string?static String
substring_regex(String pattern, String in, String flag, int from, boolean usingOctets, int occurrence, int group, boolean w3cNewlines)
Syntax-sugar-free form of the ISO SQLSUBSTRING_REGEX
function: return a substring specified by a pattern match in a string.static String
translate_regex(String pattern, String in, String flag, String with, int from, boolean usingOctets, int occurrence, boolean w3cNewlines)
Syntax-sugar-free form of the ISO SQLTRANSLATE_REGEX
function: return a string constructed from the input string by replacing one specified occurrence, or all occurrences, of a matching pattern.static boolean
xmlcast(ResultSet operand, Boolean base64, ResultSet target)
An implementation of XMLCAST.static Boolean
xmlexists(String expression, ResultSet passing, String[] namespaces)
An implementation ofXMLEXISTS(expression PASSING BY VALUE passing)
, using genuine XQuery.static org.postgresql.pljava.ResultSetProvider
xmltable(String rows, String[] columns, ResultSet passing, String[] namespaces, Boolean base64)
An implementation of (much of) XMLTABLE, using genuine XML Query.static SQLXML
xmltext(String sve)
PostgreSQL (as of 12) lacks the XMLTEXT function, so here it is.static SQLXML
xq_ret_content(String expression, Boolean nullOnEmpty, ResultSet passing, String[] namespaces)
A simple example corresponding toXMLQUERY(expression PASSING BY VALUE passing RETURNING CONTENT {NULL|EMPTY} ON EMPTY)
.
-
-
-
Method Detail
-
xmltext
@Function(schema="javatest") public static SQLXML xmltext(String sve) throws SQLException
PostgreSQL (as of 12) lacks the XMLTEXT function, so here it is.As long as PostgreSQL does not have the
XML(SEQUENCE)
type, this can only be theXMLTEXT(sve RETURNING CONTENT)
flavor, which does create a text node withsve
as its value, but returns the text node wrapped in a document node.This function doesn't actually require Saxon, but otherwise fits in with the theme here, implementing missing parts of SQL/XML for PostgreSQL.
- Parameters:
sve
- SQL string value to use in a text node- Returns:
- XML content, the text node wrapped in a document node
- Throws:
SQLException
-
xmlcast
@Function(schema="javatest", type="pg_catalog.record", onNullInput=CALLED, settings="IntervalStyle TO iso_8601") public static boolean xmlcast(ResultSet operand, @SQLType(defaultValue="false") Boolean base64, ResultSet target) throws SQLException
An implementation of XMLCAST.Will be declared to take and return type
RECORD
, where each must have exactly one component, just because that makes it easy to use existing JDBC metadata queries to find out the operand and target SQL data types.Serving suggestion: rewrite this ISO standard expression
XMLCAST(v AS wantedtype)
to this idiomatic one:(SELECT r FROM (SELECT v) AS o, xmlcast(o) AS (r wantedtype))
- Parameters:
operand
- a one-row, one-column record supplied by the caller, whose one typed value is the operand to be cast.base64
- true if binary SQL values should be base64-encoded in XML; if false (the default), values will be encoded in hex.target
- a one-row, one-column record supplied by PL/Java from theAS
clause after the function call, whose one column's type is the type to be cast to.- Throws:
SQLException
-
xq_ret_content
@Function(schema="javatest", onNullInput=CALLED, settings="IntervalStyle TO iso_8601") public static SQLXML xq_ret_content(String expression, Boolean nullOnEmpty, @SQLType(defaultValue={}) ResultSet passing, @SQLType(defaultValue={}) String[] namespaces) throws SQLException
A simple example corresponding toXMLQUERY(expression PASSING BY VALUE passing RETURNING CONTENT {NULL|EMPTY} ON EMPTY)
.- Parameters:
expression
- An XQuery expression. Must not benull
(in the SQL standardXMLQUERY
syntax, it is not even allowed to be an SQL expression at all, only a string literal).nullOnEmpty
- passtrue
to get a null return in place of an empty sequence, orfalse
to just get the empty sequence.passing
- A row value whose columns will be supplied to the query as parameters. Columns with names (typically supplied withAS
) appear as predeclared external variables with matching names (in no namespace) in the query, with types derived from the SQL types of the row value's columns. There may be one (and no more than one) column withAS "."
which, if present, will be bound as the context item. (The name?column?
, which PostgreSQL uses for an otherwise-unnamed column, is also accepted, which will often allow the context item to be specified with noAS
at all. Beware, though, that PostgreSQL likes to invent column names from any function or type name that may appear in the value expression, so this shorthand will not always work, whileAS "."
will.) PL/Java's internal JDBC uppercases all column names, so any uses of the corresponding variables in the query must have the names in upper case. It is safest to also uppercase their appearances in the SQL (for which, in PostgreSQL, they must be quoted), so that the JDBC uppercasing is not being relied on. It is likely to be dropped in a future PL/Java release.namespaces
- An even-length String array where, of each pair of consecutive entries, the first is a namespace prefix and the second is the URI to which to bind it. The zero-length prefix sets the default element and type namespace; if the prefix has zero length, the URI may also have zero length, to declare that unprefixed elements are in no namespace.- Throws:
SQLException
-
xmlexists
@Function(schema="javatest", onNullInput=CALLED, settings="IntervalStyle TO iso_8601") public static Boolean xmlexists(String expression, @SQLType(defaultValue={}) ResultSet passing, @SQLType(defaultValue={}) String[] namespaces) throws SQLException
An implementation ofXMLEXISTS(expression PASSING BY VALUE passing)
, using genuine XQuery.- Parameters:
expression
- An XQuery expression. Must not benull
(in the SQL standardXMLQUERY
syntax, it is not even allowed to be an SQL expression at all, only a string literal).passing
- A row value whose columns will be supplied to the query as parameters. Columns with names (typically supplied withAS
) appear as predeclared external variables with matching names (in no namespace) in the query, with types derived from the SQL types of the row value's columns. There may be one (and no more than one) column withAS "."
which, if present, will be bound as the context item. (The name?column?
, which PostgreSQL uses for an otherwise-unnamed column, is also accepted, which will often allow the context item to be specified with noAS
at all. Beware, though, that PostgreSQL likes to invent column names from any function or type name that may appear in the value expression, so this shorthand will not always work, whileAS "."
will.) PL/Java's internal JDBC uppercases all column names, so any uses of the corresponding variables in the query must have the names in upper case. It is safest to also uppercase their appearances in the SQL (for which, in PostgreSQL, they must be quoted), so that the JDBC uppercasing is not being relied on. It is likely to be dropped in a future PL/Java release.namespaces
- An even-length String array where, of each pair of consecutive entries, the first is a namespace prefix and the second is the URI to which to bind it. The zero-length prefix sets the default element and type namespace; if the prefix has zero length, the URI may also have zero length, to declare that unprefixed elements are in no namespace.- Returns:
- True if the expression evaluates to a nonempty sequence, false if it evaluates to an empty one. Null if a context item is passed and its SQL value is null.
- Throws:
SQLException
-
xmltable
@Function(schema="javatest", onNullInput=CALLED, settings="IntervalStyle TO iso_8601") public static org.postgresql.pljava.ResultSetProvider xmltable(String rows, String[] columns, @SQLType(defaultValue={}) ResultSet passing, @SQLType(defaultValue={}) String[] namespaces, @SQLType(defaultValue="false") Boolean base64) throws SQLException
An implementation of (much of) XMLTABLE, using genuine XML Query.The
columns
array must supply a valid XML Query expression for every column in the column definition list that follows the call of this function in SQL, except that the column for ordinality, if wanted, is identified by anull
entry incolumns
. Syntax sugar in the standard allows an omitted column expression to imply an element test for an element with the same name as the column; that doesn't work here.For now, this implementation lacks the ability to specify defaults for when a column expression produces an empty sequence. It is possible to do defaults explicitly by rewriting a query expression expr as
let $e :=
exprreturn if(empty($e))then $D else $e
and supplying the default D as another query parameter, though such defaults will be evaluated only once whenxmltable
is called and will not be able to refer to other values in an output row.- Parameters:
rows
- The single XQuery expression whose result sequence generates the rows of the resulting table. Must not be null.columns
- Array of XQuery expressions, exactly as many as result columns in the column definition list that follows the SQL call to this function. This array must not be null. It is allowed for one element (and no more than one) to be null, marking the corresponding column to be "FOR ORDINALITY" (the column must be of "exact numeric with scale zero" type; PostgreSQL supports 64-bit row counters, soint8
is recommended).passing
- A row value whose columns will be supplied to the query as parameters, just as described forxq_ret_content()
. If a context item is supplied, it is the context item for therows
query (thecolumns
queries get their context item from therows
query's result). Any named parameters supplied here are available both in therows
expression and (though this goes beyond the standard) in every expression ofcolumns
, with their values unchanging from row to row.namespaces
- An even-length String array where, of each pair of consecutive entries, the first is a namespace prefix and the second is to URI to which to bind it, just as described forxq_ret_content()
.base64
- whether the effective, in-scope 'xmlbinary' setting calls for base64 or (the default, false) hexadecimal.- Throws:
SQLException
-
close
public void close()
Called when PostgreSQL has no need for more rows of the tabular result.- Specified by:
close
in interfaceorg.postgresql.pljava.ResultSetProvider
-
assignRowValues
public boolean assignRowValues(ResultSet receive, long currentRow) throws SQLException
Produce and return one row of theXMLTABLE
result table per call.The row expression has already been compiled and its evaluation begun, producing a sequence iterator. The column XQuery expressions have all been compiled and are ready to evaluate, and the compiler's static analysis has bounded the data types they will produce. Because of the way the set-returning function protocol works, we don't know the types of the SQL output columns yet, until the first call of this function, when the
receive
parameter'sResultSetMetaData
can be inspected to find out. So that will be the first thing done when called withcurrentRow
of zero.Each call will then: (a) get the next value from the row expression's sequence iterator, then for each column, (b) evaluate that column's XQuery expression on the row value, and (c) assign that column's result to the SQL output column, casting to the proper type (which the SQL/XML spec has very exacting rules on how to do).
A note before going any further: this implementation, while fairly typical of a PostgreSQL set-returning user function, is not the way the SQL/XML spec defines
XMLTABLE
. The official behavior ofXMLTABLE
is defined in terms of a rewriting, at the SQL level, into a much-expanded SQL query where each result column appears as anXMLQUERY
call applying the column expression, wrapped in anXMLCAST
to the result column type (with aCASE WHEN XMLEXISTS
thrown in to support column defaults).As an ordinary user function, this example cannot rely on any fancy query rewriting during PostgreSQL's parse analysis. The slight syntax desugaring needed to transform a standard
XMLTABLE
call into a call of this "xmltable" is not too hard to learn and do by hand, but no one would ever want to write out by hand the whole longwinded "official" expansion prescribed in the spec. So this example is a compromise.The main thing lost in the compromise is the handling of column defaults. The full rewriting with per-column SQL expressions means that each column default expression can be evaluated exactly when/if needed, which is often the desired behavior. This implementation as an ordinary function, whose arguments all get evaluated ahead of the call, can't really do that. Otherwise, there's nothing in the spec that's inherently unachievable in this implementation.
Which brings us to the matter of casting each column expression result to the proper type for its SQL result column.
Like any spec,
SQL/XML
does not mandate that an implementation must be done in exactly the way presented in the spec (rewritten so each column value is produced by anXMLQUERY
wrapped in anXMLCAST
). The requirement is to produce the equivalent result.A look at the rewritten query shows that each column XQuery result value must be representable as some value in SQL's type system, not once, but twice: first as the result returned by
XMLQUERY
and passed along toXMLCAST
, and finally with the output column's type as the result of theXMLCAST
.Now, the output column type can be whatever is wanted. Importantly, it can be either an XML type, or any ordinary SQL scalar type, like a
float
or adate
. Likewise, the XQuery column expression may have produced some atomic value (like anxs:double
orxs:date
), or some XML node, or any sequence of any of those.What are the choices for the type in the middle: the SQL value returned by
XMLQUERY
and passed on toXMLCAST
?There are two. An ISO-standard SQL
XMLQUERY
can specifyRETURNING SEQUENCE
orRETURNING CONTENT
. The first option produces the typeXML(SEQUENCE)
, a useful type that PostgreSQL does not currently have.XML(SEQUENCE)
can hold exactly whatever an XQuery expression can produce: a sequence of any length, of any mixture of atomic values and XML nodes (even such oddities as attribute nodes outside of any element), in any order. AnXML(SEQUENCE)
value need not look anything like what "XML" normally brings to mind.With the other option,
RETURNING CONTENT
, the result ofXMLQUERY
has to be something that PostgreSQL'sxml
type could store: a serialized document with XML structure, but without the strict requirements of exactly one root element with no text outside it. At the limit, a completely non-XMLish string of ordinary text is perfectly acceptable XMLCONTENT
, as long as it uses the right&...;
escapes for any characters that could look like XML markup.XMLCAST
is able to accept either form as input, and deliver it to the output column as whatever type is needed. But the spec leaves no wiggle room as to which form to use:- If the result column type is
XML(SEQUENCE)
, then theXMLQUERY
is to specifyRETURNING SEQUENCE
. It produces the column's result type directly, so theXMLCAST
has nothing to do. - In every other case (every other case), the
XMLQUERY
is to specifyRETURNING CONTENT
.
At first blush, that second rule should sound crazy. Imagine a column definition like
growth float8 PATH 'math:pow(1.0 + $RATE, count(year))'
The expression produces anxs:double
, which can be assigned directly to a PostgreSQLfloat8
, but the rule in the spec will have it first converted to a decimal string representation, made into a text node, wrapped in a document node, and returned as XML, to be passed along toXMLCAST
, which parses it, discards the wrapping document node, parses the text content as a double, and returns that as a proper value of the result column type (which, in this example, it already is).The spec does not go into why this rule was chosen. The only rationale that makes sense to me is that the
XML(SEQUENCE)
data type is an SQL feature (X190) that not every implementation will support, so the spec has to defineXMLTABLE
using a rewritten query that can work on systems that do not have that type. (PostgreSQL itself, at present, does not have it.)The first rule, when
XML(SEQUENCE)
is the result column type, will naturally never be in play except on a system that has that type, in which case it can be used directly. But even such a system must still produce, in all other cases, results that match what a system without that type would produce. All those cases are therefore defined as if going the long way throughXML(CONTENT)
.Whenever the XQuery expression can be known to produce a (possibly empty or) singleton sequence of an atomic type, the long round trip can be shown to be idempotent, and we can skip right to casting the atomic type to the SQL result column type. A few other cases could be short-circuited the same way. But in general, for cases involving nodes or non-singleton sequences, it is safest to follow the spec punctiliously; the steps are defined in terms of XQuery constructs like
document {...}
anddata()
, which have specs of their own with many traps for the unwary, and the XQuery library provides implementations of them that are already tested and correct.Though most of the work can be done by the XQuery library, it may be helpful to look closely at just what the specification entails.
Again, but for the case of an
XML(SEQUENCE)
result column, in all other cases the result must pass throughXMLQUERY(... RETURNING CONTENT EMPTY ON EMPTY)
. That, in turn, is defined as equivalent toXMLQUERY(... RETURNING SEQUENCE)
with the result then passed toXMLDOCUMENT(... RETURNING CONTENT)
, whose behavior is that of a document node constructor in XQuery, with construction modepreserve
. The first step of that behavior is the same as Step 1e in the processing of direct element constructor content. The remaining steps are those laid out for the document node constructor.Clarity demands flattening this nest of specifications into a single ordered list of the steps to apply:
- Any item in the sequence that is an array is flattened (its elements become items in the sequence).
- If any item is a function,
err:XQTY0105
is raised. - Any sequence
$s
of adjacent atomic values is replaced bystring-join($s, ' ')
. - Any XML node in the sequence is copied (as detailed in the spec).
- After all the above, any document node that may exist in the resulting sequence is flattened (replaced by its children).
- A single text node is produced for any run of adjacent text nodes in the sequence (including any that have newly become adjacent by the flattening of document nodes), by concatenation with no separator (unlike the earlier step where atomic values were concatenated with a space as the separator).
- If the sequence directly contains any attribute or namespace node,
err:XPTY0004
is raised. More on this below. - The sequence resulting from the preceding steps is wrapped in one new document node (as detailed in the spec).
At this point, the result could be returned to SQL as a value of
XML(CONTENT(ANY))
type, to be passed to anXMLCAST
invocation. This implementation avoids that, and simply proceeds with the existing Java in-memory representation of the document tree, to the remaining steps entailed in anXMLCAST
to the output column type:- If the result column type is an XML type, rewriting would turn the
XMLCAST
into a simpleCAST
and that's that. Otherwise, the result column has some non-XML, SQL type, and: - The algorithm "Removing XQuery document nodes from an XQuery sequence" is applied. By construction, we know the only such node is the one the whole sequence was recently wrapped in, two steps ago (you get your house back, you get your dog back, you get your truck back...).
- That sequence of zero or more XML nodes is passed to the
fn:data
function, producing a sequence of zero or more atomic values, which will
all have type
xs:untypedAtomic
(because the document-wrapping stringified any original atomic values and wrapped them in text nodes, for which the typed-value isxs:untypedAtomic
by definition). This sequence also has cardinality zero-or-more, and may be shorter or longer than the original. - If the sequence is empty, the result column is assigned
NULL
(or the column's default value, if one was specified). Otherwise, the sequence is known to have length one or more, and: - The spec does not say this (which may be an oversight or bug), but the
sequence must be checked for length greater than one, raising
err:XPTY0004
in that case. The following steps require it to be a singleton. - It is labeled as a singleton sequence of
xs:anyAtomicType
and used as input to an XQuerycast as
expression. (Alternatively, it could be labeled a one-or-more sequence ofxs:anyAtomicType
, leaving the length check to be done bycast as
, which would raise the same errorerr:XPTY0004
, if longer than one.) - The
cast as
is to the XQuery type determined as indetermineXQueryFormalType
below, based on the SQL type of the result column; or, if the SQL type is a date/time type with no time zone, there is a firstcast as
to a specific XSD date/time type, which is (if it has a time zone) first adjusted to UTC, then stripped of its time zone, followed by a secondcast as
from that type to the one determined from the result column type. Often, that will be the same type as was used for the time zone adjustment, and the secondcast as
will have nothing to do. - The XQuery value resulting from the cast is converted and assigned to the SQL-typed result column, a step with many details but few surprises, therefore left for the morbidly curious to explore in the code. The flip side of the time zone removal described above happens here: if the SQL column type expects a time zone and the incoming value lacks one, it is given a zone of UTC.
The later steps above, those following the length-one check, are handled by
xmlCastAsNonXML
below.The earlier steps, from the start through the
XMLCAST
early steps of document-node unwrapping, can all be applied by letting the original result sequence be$EXPR
in the expression:declare construction preserve; data(document { $EXPR } / child::node())
which may seem a bit of an anticlimax after seeing how many details lurk behind those tidy lines of code.About bare attribute nodes
One consequence of the rules above deserves special attention. Consider something like:
XMLTABLE('.' PASSING '<a foo="bar"/>' COLUMNS c1 VARCHAR PATH 'a/@foo');
The result of the column expression is an XML attribute node all on its own, with name
foo
and valuebar
, not enclosed in any XML element. In the data typeXML(SEQUENCE)
, an attribute node can appear standalone like that, but not inXML(CONTENT)
.Db2, Oracle, and even the XPath-based pseudo-XMLTABLE built into PostgreSQL, will all accept that query and produce the result "bar".
However, a strict interpretation of the spec cannot produce that result, because the result column type (
VARCHAR
) is notXML(SEQUENCE)
, meaning the result must be as if passed throughXMLDOCUMENT(... RETURNING CONTENT)
, and the XQuerydocument { ... }
constructor is required to raiseerr:XPTY0004
upon encountering any bare attribute node. The apparently common, convenient behavior of returning the attribute node's value component is not, strictly, conformant.This implementation will raise
err:XPTY0004
. That can be avoided by simply wrapping any such bare attribute indata()
:... COLUMNS c1 VARCHAR PATH 'a/data(@foo)');
It is possible the spec has an editorial mistake and did not intend to require an error for this usage, in which case this implementation can be changed to match a future clarification of the spec.
- Specified by:
assignRowValues
in interfaceorg.postgresql.pljava.ResultSetProvider
- Specified by:
assignRowValues
in interfaceorg.postgresql.pljava.ResultSetProvider.Large
- Throws:
SQLException
- If the result column type is
-
like_regex
@Function(schema="javatest") public static boolean like_regex(String value, String pattern, @SQLType(defaultValue="") String flag, @SQLType(defaultValue="false") boolean w3cNewlines) throws SQLException
Function form of the ISO SQL<regex like predicate>
.Rewrite the standard form
value LIKE_REGEX pattern FLAG flags
into this form:like_regex(value, pattern, flag => flags)
where theflag
parameter defaults to no flags if omitted.The SQL standard specifies that pattern elements sensitive to newlines (namely
^
,$
,\s
,\S
, and.
) are to support the various representations of newline set out in Unicode Technical Standard #18, RL1.6. That behavior differs from the standard W3C XQuery newline handling, as described for the flagsm
ands
and for the multicharacter escapes\s
and\S
. As an extension to ISO SQL, passingw3cNewlines => true
requests the standard W3C XQuery behavior rather than the UTS#18 behevior for newlines. If the underlying XQuery library only provides the W3C behavior, calls withoutw3cNewlines => true
will throw exceptions.- Parameters:
value
- The string to be tested against the pattern.pattern
- The XQuery regular expression.flag
- Optional string of flags adjusting the regular expression behavior.w3cNewlines
- Pass true to allow the regular expression to recognize newlines according to the W3C XQuery rules rather than those of ISO SQL.- Returns:
- True if the supplied value matches the pattern. Null if any parameter is null.
- Throws:
SQLException
- SQLDataException with SQLSTATE 2201S if the regular expression is invalid, 2201T if the flags string is invalid; SQLFeatureNotSupportedException (0A000) if (in the current implementation) w3cNewlines is false or omitted.
-
occurrences_regex
@Function(schema="javatest") public static int occurrences_regex(String pattern, @SQLType(name="\"in\"") String in, @SQLType(defaultValue="") String flag, @SQLType(name="\"from\"",defaultValue="1") int from, @SQLType(defaultValue="false") boolean usingOctets, @SQLType(defaultValue="false") boolean w3cNewlines) throws SQLException
Syntax-sugar-free form of the ISO SQLOCCURRENCES_REGEX
function: how many times does a pattern occur in a string?Rewrite the standard form
OCCURRENCES_REGEX(pattern FLAG flags IN str FROM position USING units)
into this form:occurrences_regex(pattern, flag => flags, "in" => str, "from" => position, usingOctets => true|false)
where all of the named parameters are optional except pattern and "in", and the standardUSING CHARACTERS
becomesusingOctets => false
, which is the default, andUSING OCTETS
becomesusingOctets => true
. See alsolike_regex
regarding thew3cNewlines
parameter.- Parameters:
pattern
- XQuery regular expression to seek in the input string.in
- The input string.flag
- Optional string of flags adjusting the regular expression behavior.from
- Starting position in the input string, 1 by default.usingOctets
- Whether position is counted in characters (actual Unicode characters, not any smaller encoded unit, not even Java char), which is the default, or (when true) in octets of the string's encoded form.w3cNewlines
- Pass true to allow the regular expression to recognize newlines according to the W3C XQuery rules rather than those of ISO SQL.- Returns:
- The number of occurrences of the pattern in the input string, starting from the specified position. Null if any parameter is null; -1 if the start position is less than 1 or beyond the end of the string.
- Throws:
SQLException
- SQLDataException with SQLSTATE 2201S if the regular expression is invalid, 2201T if the flags string is invalid; SQLFeatureNotSupportedException (0A000) if (in the current implementation) usingOctets is true, or w3cNewlines is false or omitted.
-
position_regex
@Function(schema="javatest") public static int position_regex(String pattern, @SQLType(name="\"in\"") String in, @SQLType(defaultValue="") String flag, @SQLType(name="\"from\"",defaultValue="1") int from, @SQLType(defaultValue="false") boolean usingOctets, @SQLType(defaultValue="false") boolean after, @SQLType(defaultValue="1") int occurrence, @SQLType(name="\"group\"",defaultValue="0") int group, @SQLType(defaultValue="false") boolean w3cNewlines) throws SQLException
Syntax-sugar-free form of the ISO SQLPOSITION_REGEX
function: where does a pattern, or part of it, occur in a string?Rewrite the standard forms
POSITION_REGEX(START pattern FLAG flags IN str FROM position OCCURRENCE n GROUP m) POSITION_REGEX(AFTER pattern FLAG flags IN str FROM position OCCURRENCE n GROUP m)
into these forms, respectively:position_regex(pattern, flag => flags, "in" => str, "from" => position, occurrence => n, "group" => m) position_regex(pattern, flag => flags, "in" => str, "from" => position, occurrence => n, "group" => m, after => true)
where all of the named parameters are optional except pattern and "in". See alsooccurrences_regex
regarding theusingOctets
parameter, andlike_regex
regardingw3cNewlines
.- Parameters:
pattern
- XQuery regular expression to seek in the input string.in
- The input string.flag
- Optional string of flags adjusting the regular expression behavior.from
- Starting position in the input string, 1 by default.usingOctets
- Whether position is counted in characters (actual Unicode characters, not any smaller encoded unit, not even Java char), which is the default, or (when true) in octets of the string's encoded form.after
- Whether to return the position where the match starts (when false, the default), or just after the match ends (when true).occurrence
- If specified as an integer n (default 1), returns the position starting (or after) the nth match of the pattern in the string.group
- If zero (the default), returns the position starting (or after) the match of the whole pattern overall, otherwise if an integer m, the position starting or after the mth parenthesized group in (the nth occurrence of) the pattern.w3cNewlines
- Pass true to allow the regular expression to recognize newlines according to the W3C XQuery rules rather than those of ISO SQL.- Returns:
- The position, in the specified units, starting or just after, the nth occurrence (or mth capturing group of the nth occurrence) of the pattern in the input string, starting from the specified position. Null if any parameter is null; zero if the start position is less than 1 or beyond the end of the string, if occurrence is less than 1 or greater than the number of matches, or if group is less than zero or greater than the number of parenthesized capturing groups in the pattern.
- Throws:
SQLException
- SQLDataException with SQLSTATE 2201S if the regular expression is invalid, 2201T if the flags string is invalid; SQLFeatureNotSupportedException (0A000) if (in the current implementation) usingOctets is true, or w3cNewlines is false or omitted.
-
substring_regex
@Function(schema="javatest") public static String substring_regex(String pattern, @SQLType(name="\"in\"") String in, @SQLType(defaultValue="") String flag, @SQLType(name="\"from\"",defaultValue="1") int from, @SQLType(defaultValue="false") boolean usingOctets, @SQLType(defaultValue="1") int occurrence, @SQLType(name="\"group\"",defaultValue="0") int group, @SQLType(defaultValue="false") boolean w3cNewlines) throws SQLException
Syntax-sugar-free form of the ISO SQLSUBSTRING_REGEX
function: return a substring specified by a pattern match in a string.Rewrite the standard form
SUBSTRING_REGEX(pattern FLAG flags IN str FROM position OCCURRENCE n GROUP m)
into this form:substring_regex(pattern, flag => flags, "in" => str, "from" => position, occurrence => n, "group" => m)
where all of the named parameters are optional except pattern and "in". See alsoposition_regex
regarding theoccurrence
and"group"
parameters,occurrences_regex
regardingusingOctets
, andlike_regex
regardingw3cNewlines
.- Parameters:
pattern
- XQuery regular expression to seek in the input string.in
- The input string.flag
- Optional string of flags adjusting the regular expression behavior.from
- Starting position in the input string, 1 by default.usingOctets
- Whether position is counted in characters (actual Unicode characters, not any smaller encoded unit, not even Java char), which is the default, or (when true) in octets of the string's encoded form.occurrence
- If specified as an integer n (default 1), returns the nth match of the pattern in the string.group
- If zero (the default), returns the match of the whole pattern overall, otherwise if an integer m, the match of the mth parenthesized group in (the nth occurrence of) the pattern.w3cNewlines
- Pass true to allow the regular expression to recognize newlines according to the W3C XQuery rules rather than those of ISO SQL.- Returns:
- The substring matching the nth occurrence (or mth capturing group of the nth occurrence) of the pattern in the input string, starting from the specified position. Null if any parameter is null, if the start position is less than 1 or beyond the end of the string, if occurrence is less than 1 or greater than the number of matches, or if group is less than zero or greater than the number of parenthesized capturing groups in the pattern.
- Throws:
SQLException
- SQLDataException with SQLSTATE 2201S if the regular expression is invalid, 2201T if the flags string is invalid; SQLFeatureNotSupportedException (0A000) if (in the current implementation) usingOctets is true, or w3cNewlines is false or omitted.
-
translate_regex
@Function(schema="javatest") public static String translate_regex(String pattern, @SQLType(name="\"in\"") String in, @SQLType(defaultValue="") String flag, @SQLType(name="\"with\"",defaultValue="") String with, @SQLType(name="\"from\"",defaultValue="1") int from, @SQLType(defaultValue="false") boolean usingOctets, @SQLType(defaultValue="0") int occurrence, @SQLType(defaultValue="false") boolean w3cNewlines) throws SQLException
Syntax-sugar-free form of the ISO SQLTRANSLATE_REGEX
function: return a string constructed from the input string by replacing one specified occurrence, or all occurrences, of a matching pattern.Rewrite the standard forms
TRANSLATE_REGEX(pattern FLAG flags IN str WITH repl FROM position OCCURRENCE ALL) TRANSLATE_REGEX(pattern FLAG flags IN str WITH repl FROM position OCCURRENCE n)
into these forms, respectively:translate_regex(pattern, flag => flags, "in" => str, "with" => repl, "from" => position) translate_regex(pattern, flag => flags, "in" => str, "with" => repl, "from" => position, occurrence => n)
where all of the named parameters are optional except pattern and "in" (the default for "with" is the empty string, resulting in matches being deleted). See alsoposition_regex
regarding theoccurrence
parameter,occurrences_regex
regardingusingOctets
, andlike_regex
regardingw3cNewlines
.For the specified occurrence (or all occurrences), the matching portion s of the string is replaced as by the XQuery function replace(s, pattern, repl, flags). The repl string may contain
$0
to refer to the entire matched substring, or$
m to refer to the mth parenthesized capturing group in the pattern.- Parameters:
pattern
- XQuery regular expression to seek in the input string.in
- The input string.flag
- Optional string of flags adjusting the regular expression behavior.with
- The replacement string, possibly with $m references.from
- Starting position in the input string, 1 by default.usingOctets
- Whether position is counted in characters (actual Unicode characters, not any smaller encoded unit, not even Java char), which is the default, or (when true) in octets of the string's encoded form.occurrence
- If specified as an integer n (default 0 for "ALL"), replace the nth match of the pattern in the string.w3cNewlines
- Pass true to allow the regular expression to recognize newlines according to the W3C XQuery rules rather than those of ISO SQL.- Returns:
- The input string with one occurrence or all occurences of the pattern replaced, as described above. Null if any parameter is null, or if the start position is less than 1 or beyond the end of the string. The input string unchanged if occurrence is less than zero or exceeds the number of matches.
- Throws:
SQLException
- SQLDataException with SQLSTATE 2201S if the regular expression is invalid, 2201T if the flags string is invalid; 2201U if replacing where the pattern has matched a substring of zero length; 2201V if the replacement string has improper form (a backslash must be used to escape any dollar sign or backslash intended literally); SQLFeatureNotSupportedException (0A000) if (in the current implementation) usingOctets is true, or w3cNewlines is false or omitted.
-
-