XQuery is an XML processing language similar to XSLT, except XQuery at least had enough sense to use real syntax instead of trying to write code in XML.
Let's see how it turned out.
Hello, World!
I'll be using the BaseX implementation of XQuery; you can get it with brew install basex.
First, let's create a simple document, hello.xml:
<?xml version="1.0" ?>
<persons>
<person>
<name>Alice</name>
</person>
<person>
<name>Bob</name>
</person>
</persons>
Then with this hello.xquery:
<messages>
{
for $name in doc("hello.xml")//name
return <message>Hello, {data($name)}!</message>
}
</messages>
We can get our answer:
$ basex hello.xquery
<messages>
<message>Hello, Alice!</message>
<message>Hello, Bob!</message>
</messages>
By default, the output lacks the XML header and the final newline.
What we can notice here:
- we did not pass any document to process like with XSLT; the XQuery script specified which documents it wanted to open with doc("name.xml")
- we can do any XPath with doc("hello.xml")//name
- we can switch between XML and code: XML tags start XML mode, {...} starts code mode
- the core of XQuery is the FLWOR (for, let, where, order by, return) expression
- variables use the $ prefix
- to get the text content of a node $name, use data($name); otherwise XQuery would insert <name>Alice</name>
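The example above only uses for and return. Here's a minimal sketch that uses all five FLWOR clauses. It inlines the data instead of reading hello.xml so it runs on its own, and the extra person (Carol) is made up for illustration.

```xquery
let $persons :=
  <persons>
    <person><name>Bob</name></person>
    <person><name>Carol</name></person>
    <person><name>Alice</name></person>
  </persons>
return
  <messages>{
    (: Inline data replaces doc("hello.xml"); Carol is a hypothetical extra entry :)
    for $person in $persons/person       (: for: iterate over nodes :)
    let $name := data($person/name)      (: let: bind the text content :)
    where $name != "Bob"                 (: where: filter :)
    order by $name                       (: order by: sort :)
    return <message>Hello, {$name}!</message>
  }</messages>
```

This should greet Alice and then Carol, and skip Bob.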
Loop
The thing we're selecting doesn't have to be XML, and the thing we're generating doesn't have to be XML either. Note that XQuery will normally insert newlines between returned items, but not after the last one.
(: This is XQuery loop :)
for $n in (1 to 20)
return $n
$ basex loop.xquery
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
XQuery comments use the unusual (: ... :) syntax.
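To make the "neither side has to be XML" point concrete, here's a small sketch that loops over plain strings and returns plain strings. The word list is made up; at $i is standard FLWOR syntax for getting the position of each item.

```xquery
(: Input is a sequence of strings, output is a sequence of strings; no XML anywhere :)
for $word at $i in ("alpha", "beta", "gamma")
return concat($i, ": ", upper-case($word))
```

The first returned item should be the string 1: ALPHA, and so on for the other two words.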
FizzBuzz
for $n in (1 to 100)
let $fizz := $n mod 3 = 0
let $buzz := $n mod 5 = 0
return (
if ($fizz and $buzz)
then "FizzBuzz"
else if ($buzz)
then "Buzz"
else if ($fizz)
then "Fizz"
else $n
)
This almost works, except it lacks the final newline after the last element.
FizzBuzz with correct newlines
If we want to take control over spacing, this becomes more complicated.
declare option output:method "text";
declare option output:item-separator "";
for $n in (1 to 100)
let $fizz := $n mod 3 = 0
let $buzz := $n mod 5 = 0
return (
if ($fizz and $buzz)
then "FizzBuzz&#10;"
else if ($buzz)
then "Buzz&#10;"
else if ($fizz)
then "Fizz&#10;"
else concat($n, "&#10;")
)
- we need to switch output:method to text
- we need to switch output:item-separator to an empty string: we do not want a separator, we want a terminator for each element
- it might seem we could get away with just leaving things as they are and adding &#10; at the end; this is fine for FizzBuzz, but it would be incorrect in the case of an empty result set, so it's not really a great practice
- &#10; or an equivalent XML escape generates a newline
- there is no string interpolation, so we need to use concat
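One way to keep the terminator logic in one place is a tiny helper function. This is only a sketch, not what the script above does: local:line is a name I made up, and || is the string concatenation operator from XQuery 3.0, which does the same job as concat here.

```xquery
declare option output:method "text";
declare option output:item-separator "";

(: local:line is a hypothetical helper: it turns any atomic value into a
   string and appends a newline terminator, written as an XML escape :)
declare function local:line($value as xs:anyAtomicType) as xs:string {
  string($value) || "&#10;"
};

for $n in (1 to 15)
let $fizz := $n mod 3 = 0
let $buzz := $n mod 5 = 0
return local:line(
  if ($fizz and $buzz) then "FizzBuzz"
  else if ($buzz) then "Buzz"
  else if ($fizz) then "Fizz"
  else $n
)
```

Because every item carries its own newline, an empty range would produce empty output instead of a stray blank line.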
Fibonacci
There are a few more serialization methods, like csv and json. Let's try csv serialization for Fibonacci numbers.
declare option output:method "csv";
declare option output:csv "header=yes";
declare function local:fib($i as xs:integer) as xs:integer {
if ($i <= 2)
then 1
else local:fib($i - 1) + local:fib($i - 2)
};
<csv>{
for $n in (1 to 30)
return <record>
<N>{$n}</N>
<Fib>{local:fib($n)}</Fib>
</record>
}</csv>
Here's what it does:
$ basex fib.xquery
N,Fib
1,1
2,1
3,2
4,3
5,5
6,8
7,13
8,21
9,34
10,55
11,89
12,144
13,233
14,377
15,610
16,987
17,1597
18,2584
19,4181
20,6765
21,10946
22,17711
23,28657
24,46368
25,75025
26,121393
27,196418
28,317811
29,514229
30,832040
Step by step:
- we switch to CSV output mode with declare option output:method "csv";
- we turn on headers with declare option output:csv "header=yes";
- while we're actually going to output CSV, we need to pretend we're generating XML
- the <csv> and <record> tag names are not special, anything will do
- declare function local:fib($i as xs:integer) as xs:integer { ... } declares a function and all the types it takes and returns
- we need to put the function in some namespace, local: is a reasonable choice here; just declare function fib(...) wouldn't work
- there's no return inside the function, it's just an expression; XQuery return has very little to do with what return means in other languages
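The recursive local:fib above recomputes the same values many times. As a sketch of the same declaration syntax with more than one typed parameter, here's an accumulator version. local:fib-acc and its parameter names are mine, not from the original script.

```xquery
(: local:fib-acc is a hypothetical variant of local:fib.
   $a and $b carry two consecutive Fibonacci numbers, so each call does
   constant work and the recursion depth is linear in $i. :)
declare function local:fib-acc(
  $i as xs:integer,
  $a as xs:integer,
  $b as xs:integer
) as xs:integer {
  if ($i <= 1)
  then $a
  else local:fib-acc($i - 1, $b, $a + $b)
};

for $n in (1 to 10)
return local:fib-acc($n, 1, 1)
```

This returns the same first ten numbers as the table above (1, 1, 2, 3, 5, 8, 13, 21, 34, 55), and again the function body is just an expression with no return in it.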
JSON Input
For this I got Cat Facts JSON.
<catfacts>{
for $name in json:doc("catfacts.json")//text
return <fact>{data($name)}</fact>
}</catfacts>
$ basex catfacts.xquery
<catfacts>
<fact>Cats make about 100 different sounds. Dogs make only about 10.</fact>
<fact>Domestic cats spend about 70 percent of the day sleeping and 15 percent of the day grooming.</fact>
<fact>I don't know anything about cats.</fact>
<fact>The technical term for a cat’s hairball is a bezoar.</fact>
<fact>Cats are the most popular pet in the United States: There are 88 million pet cats and 74 million dogs.</fact>
</catfacts>
json:doc parses a JSON document and turns it into an XML-like structure with some weird tags like <verified type="boolean">true</verified>, <json type="array">, <status type="object">, etc.
It looks really weird when printed as XML, but it's not too bad to query it like it's XML.
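To look at that intermediate structure without downloading anything, you can parse a JSON string directly. This is a sketch assuming BaseX's json:parse function with its default settings; the JSON here is a made-up miniature in the spirit of the Cat Facts data, not the real file.

```xquery
(: Made-up sample data; the real catfacts.json has more fields :)
let $json := '[{"text": "Cats sleep a lot.", "verified": true}]'
(: json:parse is BaseX-specific, like json:doc, but takes a string instead of a file :)
return json:parse($json)
```

The result should show the same kind of typed tags as mentioned above. Swapping the last line for return json:parse($json)//text queries it the same way the Cat Facts script does.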
JSON Output
And of course we can do it the other way. It might seem like I'm doing a lot of CSV and JSON in an XML episode, but that's likely quite representative of the real world. XML-based systems are a sizable minority, and a lot of transformation tasks will be about getting data into and out of XML.
For example this exciting code:
declare option output:method "json";
<json type="array">{
for $n in (1 to 10)
return <_ type="object">
<number type="number">{$n}</number>
<odd type="boolean">true</odd>
<even type="boolean">false</even>
</_>
}</json>
Generates the following:
$ basex oddeven.xquery
[
{
"number":1,
"odd":true,
"even":false
},
{
"number":2,
"odd":true,
"even":false
},
{
"number":3,
"odd":true,
"even":false
},
{
"number":4,
"odd":true,
"even":false
},
{
"number":5,
"odd":true,
"even":false
},
{
"number":6,
"odd":true,
"even":false
},
{
"number":7,
"odd":true,
"even":false
},
{
"number":8,
"odd":true,
"even":false
},
{
"number":9,
"odd":true,
"even":false
},
{
"number":10,
"odd":true,
"even":false
}
]
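The odd and even values in that query are hardcoded, which is why every number claims to be odd. Here's a minimal sketch of computing them instead; enclosed expressions work inside the typed elements the same way they do for number. I shortened the range to keep the output small.

```xquery
declare option output:method "json";

<json type="array">{
  for $n in (1 to 4)
  return <_ type="object">
    <number type="number">{$n}</number>
    <odd type="boolean">{$n mod 2 = 1}</odd>
    <even type="boolean">{$n mod 2 = 0}</even>
  </_>
}</json>
```

With this, the entry for 2 should come out with "odd":false and "even":true.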
You could even use XQuery to process inputs and outputs neither of which are XML. Turn some JSON into CSV or whatnot. It would basically construct those intermediate XMLs when loading data or before saving it.
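As a sketch of that JSON to CSV idea, this combines the two pieces used earlier: BaseX's JSON parsing on the way in and the csv output method on the way out. The JSON string and its field names are made up, and I'm assuming json:parse with default settings, which represents array items as <_> elements just like in the JSON output example.

```xquery
declare option output:method "csv";
declare option output:csv "header=yes";

(: Made-up input; in a real script this could come from json:doc("some-file.json") :)
let $json := '[{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}]'
return
  <csv>{
    (: Each array item becomes a <_> element in the intermediate XML :)
    for $item in json:parse($json)//_
    return <record>
      <name>{data($item/name)}</name>
      <age>{data($item/age)}</age>
    </record>
  }</csv>
```

Neither the input nor the output is XML, but both pass through an XML shape in the middle, which is the only thing XQuery really works with.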
Should you use XQuery?
I'd not recommend it for most people.
However, unlike XSLT, which is completely insane, XQuery is a legitimate tool created by some sane people for a legitimate purpose.
It's just that for the vast majority of people, XML processing is not common enough to warrant learning a whole new language. General purpose programming languages tend to be pretty much just as concise and expressive at transforming XML into other XML, while being so much better at all the associated tasks like fetching data from the internet or databases, dealing with JSON or CSV, and any nontrivial data transformation. Ruby with Nokogiri in particular is far better than any of the XML-specific languages, but Python and others are totally adequate as well.
This is different from jq, which I definitely recommend, as JSON processing is a far more common task, and jq is extremely concise for shell one-liners. In a way, what jq does is closer to XPath (which you can use with your regular language) than to either XQuery or XSLT.
Then again, if you find yourself processing XML a lot, and especially if your default language doesn't have anything as nice as Ruby's Nokogiri, XQuery might be worth checking out.
Code
All code examples for the series will be in this repository.