/
/README
1 ## MaCaQue : Macros for Caml Queries
2
3 MaCaQue (or macaque) is a DSL for SQL Queries in Caml.
4
5 It has the following objectives :
6 - be highly compositional
7 - use expressive types to avoid runtime errors
8
9 macaque is still a work in progress, please send feedback to
10 `bluestorm dot dylc (gmail)`. I'm interested in bug reports and
11 reasonable feature requests.
12
13 Usual makefile provided (in src/) :
14 make, make install, bleh, make uninstall
15 Depends on PG'Ocaml
16
17 Summary of this document :
18 - [Important macaque types and structures](#macaque_types)
19 - [Sending queries to the SQL server](#sending_queries)
20 - [Processing query results](#processing_results)
21 - [Single value queries](#single_value_query)
22 - [General syntax : values, comprehensions, queries](#syntax_description)
23 - [View expressions](#view)
24 - [View results](#view_results)
25 - [View modifiers](#view_modifiers)
26 - [Comprehension items](#comprehension_items)
27 - [GROUP BY expressions and aggregate functions](#group_by)
28 - [Empty GROUP part](#empty_group)
29 - [Restrictions on the GROUP record, accumulators \[..\] syntax](#accumulators)
30 - [Static GROUP BY typing errors](#static_group_by_typing)
31 - [Non-grouped aggregates](#non_grouped_aggregates)
32 - [Description syntax](#descriptions)
33 - [Auto-incrementing columns](#incr_column)
34 - [Sequence descriptions](#sequence_descriptions)
35 - [Checking descriptions](#check_descriptions)
36 - [Sql functions, operators and data types](#SQL_operators)
37 - [Sql value operators](#value_operators)
38 - [Value types antiquotations](#value_antiquotations)
39 - [Column types](#column_types)
40 - [Sequence operators](#sequence_ops)
41 - [View antiquotations](#view_antiquotations)
42 - [phantom types](#phantom_types)
43 - [Specific type fourberies](#type_subtleties)
44 - [Sql interface safety](#interface_safety)
45 - [Update subtyping problem](#update_subtyping)
46 - [Remarks](#general_remarks)
47 - [Side effects](#side_effects)
48 - [Semantic of row bindings](#row_binding_semantics)
49 - [Camlp4 use](#camlp4_use)
50 - [Macaque and PG'OCaml](#pgocaml_compat)
51 - [NULL and semantic issues](#null_inference)
52
53
54
55 ### Important macaque types and structures [.macaque_types]
56
57 macaque builds SQL queries/views/values at three different levels :
58
59 ##### query
60
61 Macaque queries represent `SELECT`, `INSERT`, `DELETE` and `UPDATE`
62 queries. They can not be composed.
63
64 Examples:
65
66 - Select all rows from table `sql_table` whose `id` is inferior to `50` :
67
68 <:select< row | row in $sql_table$; row.id < 50 >>
69
70 - Delete all rows from `sql_table` whose `id` is superior to `50` :
71
72 <:delete< row in $sql_table$ | row.id > 50 >>
73
74
75 Type : macaques queries have type `'a Sql.query`, where `'a` is the
76 return type of the query : a row list for `SELECT` queries, unit
77 otherwise.
78
79 ##### view
80
81 Views represent SQL tables, views and (sub-)-`SELECT` results; view
82 are composable and can be used nearly anywhere a SQL table can.
83
84 Examples :
85
86 - User name and category of the user with id superior to `42`
87
88 <:view< {user = u.name; category = cat.name} |
89 u in $users$;
90 u.id > 32 + 10;
91 cat in $categories$;
92 cat.id = u.cat >>
93
94
95 - Names of all simple users of the 'prev_view' view (wich might be the previous example view)
96
97 <:view< {name = row.user} |
98 row in $prev_view$;
99 row.category = "simple user" >>
100
101
102 - Generic function that will project any view on its "name" column
103
104 let names view = <:view< {name = t.name} | t in $view$ >>
105
106
107 - Generic function that will build an associative table, wich the
108 'name' column in the 'name' field and the whole column in the
109 't' field.
110
111 let names view = <:view< {name = t.name; t = t} | t in $view$ >>
112
113
114 Type : `('a, 'b) Sql.view`, where `'a` is the type of a row
115 (considered as a value). `'b` is either `non_writable` or `'c
116 writable` : only writable views correspond to real tables in the SQL
117 database, and thus insert/update/delete won't accept non-writable
118 views. The `'c` type parameter in `'c writable` is useless for now, it
119 will eventually be used to transport information about default values
120 for table fields.
121
122 :Note:
123
124 views and select queries are distinct objects (view = reusable set
125 of rows, select = final query), but views are very common so
126 a specific library function is provided to directly send a view to
127 the SQL server, without having to manually build a select query
128 first.
129
130 ##### value
131
132 Values represent SQL values and expressions : `int`, `float`,
133 `string`, and results of SQL data functions and operators.
134
135 Examples :
136
137 - `2` (as a Sql Value, not a Caml int)
138
139 <:value< 1 + 1 >>
140
141
142 - `true` (as a Sql Value, not a Caml bool)
143
144 <:value< 1 = 1 >>
145
146
147 - field 'id' of table 'foo', casted as a nullable value (wich can take the `NULL` value)
148
149 <:value< nullable foo.id >>
150
151
152 - returns twice the given int, as a Sql value
153
154 let double (n : int) = <:value< 2 * $int:n$ >>
155
156
157 - returns the immediate row with column `foo` and `bar` of values `1` and `baz`
158
159 <:value< { foo = 1; bar = "baz" } >>
160
161
162 Type : `'a Sql.t`, the most pervasive type in macaque. `'a` is
163 actually a rich type containing the value type (int, string, a row,
164 etc.) but also several phantom types witnessing different properties
165 of the value (detailed description in the "phantom types" section of
166 this documentation). View rows are also values, and rows can be nested
167 at will :
168
169 let strange_pack foo = <:value< { a = 1; b = { c = 2; d = $foo$ } } >>
170
171
172
173
174 ### Sending queries to the SQL server [.sending_queries]
175
176 Most of macaque efforts are directed towards producing safe and
177 composables SQL queries, and rebuilding typed values from the
178 all-string result of SQL queries. The actual interaction with the SQL
179 server is relatively minor and well separated. It is done through the
180 `Query` module.
181
182 The `Query` module is designed to work with PGOCaml (and is actually the
183 only part of macaque with a hard dependence on PGOCaml). It thus
184 reproduces the functorized design of PGOCaml, wich is intended to
185 support flexible threading/concurrence models. See PGOCaml
186 documentation for more information. The casual user can use the
187 non-functorized values, wich are a directly-usable implementation of
188 the `QUERY` interface, with no threading integrated.
189
190 The core `Query` interface is quite simple :
191
192 val query : _ Db.t -> ?log:out_channel -> 'a Sql.query -> 'a Db.monad
193 val view : _ Db.t -> ?log:out_channel -> 'a Sql.view -> 'a list Db.monad
194 val view_one : _ Db.t -> ?log:out_channel -> 'a Sql.view -> 'a Db.monad
195 val view_opt : _ Db.t -> ?log:out_channel -> 'a Sql.view -> 'a option Db.monad
196
197 (in the simple case, `_ Db.t` is `_ PGOCaml.t`, and `'a Db.monad` is just `'a`)
198
199 The main function is `query`, wich accepts a database handler (the one
200 produced by `PGOCaml.connect`), the query, and returns the typed
201 result : in case of `SELECT`, a list of value rows, otherwise
202 (`UPDATE`, `INSERT`...) `unit`. An optional parameter `?log` will
203 output the query string in the given output channel, if any.
204
205 `view`, `view_one` and `view_opt` are given for convenience.
206 `view_one` expect exactly one result, and `view_opt` at most one
207 result; if they get more results, they will raise a `Failure` exception.
208
209 :Note:
210
211 The query module is very loosely integrated with the rest of
212 Macaque. In particular, all the typed result processing is done by
213 the `Sql` modules, through the `Sql.handle_query_results`
214 function. It makes the `Query` module inessential : an user willing
215 to experiment other interfaces to the SQL server could easily get
216 the SQL string itself (`Sql.sql_of_query`, `Sql.sql_of_view`), call
217 the SQL server in her preferred way, and give back the results to
218 `Sql.handle_query_results`.
219
220 Beware however that `Sql.handle_query_results` is unsafe, as it
221 strongly relies on results corresponding to the given query; giving
222 incorrect results (eg. coming from another query) results in an
223 undefined behaviour, even possibly a Segfault. You should really not
224 try anything clever except giving the query string to a PostGreSQL
225 server (in any way you like), and sending back the results to
226 Macaque.
227
228
229
230 ### Processing query results [.processing_results]
231
232 macaque produces `Sql.t` values from the query results. The real caml
233 values can be accessed using the `Sql.get` and `Sql.getn` operators :
234
235 val get : < get : _; nul : non_nullable; t : 't #type_info > t -> 't
236 val getn : < get : _; nul : nullable; t : 't #type_info > t -> 't option
237
238 `Sql.get` operates on non-nullable values, while `Sql.getn` operates
239 no nullable values and returns an `option`.
240
241 A common use case is to access fields from a row, leading to code such
242 as `Sql.get row#field`. The problem with this syntax is that it is not
243 chainable : to access field of a subrow (row inside the result row),
244 wich is a Sql.t value itself, one has to write code such as `Sql.get
245 (Sql.get row#sub_row)#field`.
246
247 To lighten the notation, a specific syntaxic sugar is integrated with
248 the `pa_bananas` syntax extension : `a#!b` is equivalent to `Sql.get
249 a#b`, and `a#?b` is equivalent to `Sql.getn a#b`. It is thus possible
250 to write `row#!sub_row#!field` directly.
251
252 Note that `a#?b` still returns an option, so you can't have something
253 like `a#?b#?c` (as `a#?b` is not an object) : you still need to do
254 pattern matching on the option value yourself. I could create an
255 unsafe constructor wich would raise an exception in the `None` case,
256 but that would defeat the point of macaque (as few runtime errors as
257 possible). I also feels that such facilities would encourage
258 nullability-laziness from the user : you have to work on your database
259 design upfront to eradicate nullable types were possible; in the case
260 were a `NULL` case is a semantic necessity, you will want the explicit
261 option matching anyway.
262
263
264
265 #### Single value queries [.single_value_query]
266
267 It is sometimes useful to query single values from the database,
268 instead of the usual list-of-rows SELECT query -- for example to
269 request the current time, or the value of a specific counter.
270
271 For this purpose, Macaque was extended with:
272
273 - A `Sql.value` function building a query from a `Sql.t` value; This
274 query, once run through `Query.query`, agains return a `Sql.t`
275 value, on which `Sql.get` or `Sql.getn` can be used depending on its
276 nullability.
277
278 - Convenience functions in Query to get the extracted value directly
279 without using Sql.get(n): `Query.value` for non-nullable value,
280 and `Query.value_opt` for nullable ones.
281
282 As an example, the following three definitions of `get_time` are equivalent:
283
284 let time = <:value< current_timestamp >>
285
286 let get_time dbh =
287 Sql.get (Query.view_one dbh <:view< {x = $time$} >>)#x
288
289 let get_time dbh =
290 Sql.get (Query.query dbh (Sql.value time))
291
292 let get_time dbh =
293 Query.value dbh time
294
295
296
297
298 ### General syntax : values, comprehensions, queries [.syntax_description]
299
300 macaque use the Camlp4 quotations system to integrate Domain-specific
301 syntaxes into the Objective Caml language. General principles are as follows :
302
303 ##### quotations
304
305 `<:foo< ... >>` is a quotation using the 'foo' syntax expander.
306 macaque provides 6 syntax expanders :
307 - 'select', 'insert', 'delete', 'update' : queries syntax expanders
308 - 'view' : view syntax expander
309 - 'value' : value syntax expander
310
311 'view' is the default syntax expander : `<< ... >>` is equivalent to `<:view< ... >>`
312
313 ##### antiquotations
314
315 Inside a quotations, `$foo$` is an antiquotation with
316 denotes a caml-level value :
317
318 let value v = <:value< $v$ >> (* the identity function on values *)
319
320 antiquotations can also be named in specific context :
321
322 let int n = <:value< $int:n$ + 0 >>
323 (* function that transforms a caml int into a Sql.t value *)
324
325 named antiquotations are not hardcoded in the syntax definition :
326 they rely on values of the `Sql` module, wich are in the appropriate
327 submodule : `$foo:bar$` will expand to `Sql.Module.foo bar`, where the
328 `Foo` module depends on the antiquotation position (`View`, `Value`,
329 `Table_type`); this way, on can read the `Sql` interface to discover
330 named parameters and their meaning; similarly, operators are defined
331 in `Sql.Op`.
332
333 ##### Loosely defined syntax
334
335 - camlp4 quotations symbols aren't represented
336 - `[foo]` means optional `foo` (one or none)
337 - `foo sep ...` means list of `foo`s separated by `sep` (";" or ",")
338
339 :BNF:
340
341 select ::= view
342 insert ::= table ':=' value [refinement]
343 delete ::= row name 'in' table refinement
344 update ::= row name 'in' table ':=' value refinement
345 view ::= view-result view-modifiers [refinement]
346 refinement ::= '|' comprehension-items ';' ...
347 view-result ::= value
348 | 'group' tuple ['by' tuple]
349 view-modifiers ::= ['order' 'by' order ',' ...] ['limit' value] ['offset' value]
350 order ::= value ['asc' | 'desc']
351 comprehension-item ::= row-binding
352 | guard
353 row-binding ::= row-name 'in' view
354 guard ::= boolean value
355 value ::= litteral constants
356 | 0-ary operator : 'null', 'current_timestamp'
357 | 1-ary operator : 'nullable' value
358 | infix binary operator : value '+' value
359 | tuple : tuple
360 | row '.' field : field-access
361 | row '?' field : default value for a field
362 tuple ::= '{' tuple-field ';' ... '}'
363 tuple-field ::= field-name '=' value
364 | field-access
365 field-access ::= value '.' field-name '.' ...
366
367 :Note:
368
369 A few remarks.
370
371 - ";"-separated lists allows an optional ending ";", to match OCaml syntax : {a;b;}
372
373 - 'table' denotes a place where an `'a table` type is required
374
375 - antiquotations can be used in 'view', 'table' and 'value' places.
376
377 You can use antiquoted expressions in all the places where
378 a 'view', 'table' (view) or 'value' (including rows, but not row
379 names or field names) is accepted
380
381 - Tuples, comprehension and guard lists can be empty :
382
383 - `<:view< $row$ | >>` is the view with only one line, wich is 'row'
384 - `<:insert< $table$ := $row$ | >>` inserts a single row
385 - `<:delete< row in $table$ | >>` deletes a whole table
386 - `<:update< row in $table$ := $value$ | >>` updates a whole table
387
388 In the 'view' and 'insert' cases, the ending "|" is optional. It
389 is however not the case for 'update' and 'delete' : see the
390 remark at the end of this section.
391
392 - In order to simplify the common
393 `{ a = foo.a; b = bar.foobar.b }` use case, tuple field names
394 are optional when the value is a field access : in that case,
395 the name of the accessed field is used (in case of nested
396 access, eg.`'foo.bar.baz`, the rightmost name is used, here
397 `baz`); the example can thus be written `{ foo.a; bar.foobar.b }`
398
399 - if the `BY` record of a `group .. by` expression is empty, the `BY` part
400 (`by {}`) is optinal. See the [GROUP BY expressions and aggregate
401 functions](#group_by) section.
402
403 ##### Remark : implicit exhaustive manipulation [.exhaustive_manipulations]
404
405 It was found that user sometimes forgot the guards in an `UPDATE` or
406 `DELETE` query. The results of that mistake are usually rather
407 displeasing (deleting or modifying the whole table). In order to make
408 the mistake more visible and less likely to happen, the "|" before the
409 guards of an `UPDATE` or `DELETE` query are not optional by default :
410 if the user forgets the guards altogether, she will also forget the
411 "|" and will get an error.
412
413
414
415 ### View expressions [.view]
416
417 :BNF:
418
419 view ::= view-result view-modifiers ['|' comprehension-item ';' ...]
420
421 #### View results [.view_results]
422
423 View results are either a simple value or a `GROUP BY` expression (see
424 the [GROUP BY expressions and aggregate functions](#group_by) part of
425 this document).
426
427 The usual return value is a SQL tuple: `{a = foo; b = bar}` will return
428 a view with columns named `a` and `b` and values corresponding to the
429 `foo` and `bar` expressions.
430 You can also give the name of a row : `<< t | t in .. >>`
431
432 Columns values can be tuples themselves. This is a quite
433 Macaque-specific feature, wich makes the language much more
434 homogenous. Tuple types (column or immediate records) can be nested
435 arbitrarily. For example, a cartesian product view :
436
437 let cartesian_product va vb = << {a = a; b = b} | a in $va$; b in $vb$ >>
438
439 Care is taken that tuple values support all operations the plain SQL
440 values support : Macaque works hard to translate them to expressions
441 the usual SQL server can understand (tuple flattening). This part of
442 macaque is rather delicate and you might encounter bugs with
443 no-yet-well-tested combinations of nested tuples and other macaque
444 features.
445
446 #### View modifiers [.view_modifiers]
447
448 View modifiers correspond to following SQL features : `LIMIT`,
449 `OFFSET` and `ORDER BY`. Their behaviour should not come as
450 a surprise : `ORDER BY` will sort the returned rows according to the
451 specified sorting criteria, `LIMIT` will restrict the number of
452 returned rows, and `OFFSET` will skip a given number of rows (for
453 example, `LIMIT 2 OFFSET 3` will return the 4th and 5th row of
454 a view).
455
456 `ORDER BY` syntax mimics the usual SQL syntax :
457 << ... order by foo asc, bar desc, ... >>
458
459 The ordering modifier (`asc` | `desc`) is optional, `asc` will be
460 choosed by default. If the ordering expression `(foo, bar)` is
461 a record (immediate tuple or row), no guarantee is given as to the
462 column ordering decided (fields of a macaque record are not ordered,
463 so a lexicographic order cannot be chosen) : the view result will be
464 sorted on each column of the tuple, with the given ordering modifier.
465
466 `ORDER BY` expressions are in the comprehension items scope : they can
467 depend on the rows bound in the latter part of the
468 comprehension. `LIMIT` and `OFFSET` are not : obviously `<< foo LIMIT
469 foo.id | foo in $..$ >>` is not a well-formed query.
470
471 #### Comprehension items [.comprehension_items]
472
473 There is one important thing to know about comprehension items \: row
474 bindings are not sequential, they're simultaneous (`let .. and .. and
475 .. in ..`) ! This is a not so happy feature, as the comprehension
476 syntax strongly suggest a sequential binding : there is an impedence
477 mismatch between the comprehension syntax and the SQL behaviour. For
478 further discussion, see the [Remarks > semantic of row
479 bindings](#row_binding_semantics) section of this manual.
480
481 Of course, guards can depend on the declared rows.
482
483
484
485 ### GROUP BY expressions and aggregate functions [.group_by]
486
487 The SQL query `SELECT fields GROUP BY group_fields` roughly translate
488 in the `group {fields} by {group_fields}` expression : after `group`
489 the aggregating part, and after `by` the grouping part.
490
491 The returned row are the concatenation of the `GROUP` record (`{fields}`)
492 and the `BY` records (`{group_fields}`).
493
494 Example :
495 << group {subtotal = sum[t.a]} by {k = t.b} | t in $view$ >>
496 Will expand to
497 SELECT SUM(t.a) AS subtotal, t.b AS k GROUP BY t.b FROM (...) AS t
498
499 #### Empty GROUP part [.empty_group]
500
501 `GROUP BY` expressions with all fields in the `BY` part are equivalent to
502 a `SELECT DISTINCT` query.
503
504 #### Restrictions on the GROUP record, accumulators [..] syntax [.accumulators]
505
506 The `BY` values (here `t.b`) can be any expression depending on the bound
507 view (here `t`, but possibly more than one), and anything else in
508 scope. The group values are restricted to match the `GROUP .. BY`
509 semantic :
510 - values depending on the bound tables of the view must be inside
511 "accumulators", that is square brackets :
512 - `<:view< group {c = t.c} by {k = t.b} | ... >>` is illegal (and won't type),
513 - accumulators must be used by an aggregate function, and nowhere else :
514 - `<:view< group {c = [t.c]} by {k = t.b} | ... >>` is illegal (and won't type),
515 - `<:view< group {c = count[t.c]} by {k = t.b} | ... >>` is legal
516 - fields name bound in the `BY` record can be used freely :
517 - `<:view< group {c = k; d = k + k; e = count[l]} by {k = t.b; l = t.c} | ... >>` is legal
518
519 #### Corresponding typing errors [.static_group_by_typing]
520
521 - Use of group-varying values outside accumulators
522
523 Error: This expression has type Sql.grouped_row
524 but an expression was expected of type
525 < nul : Sql.non_nullable; t : 'a #Sql.row_t; .. > Sql.t
526
527 Cause : use of a bound row in the `GROUP` record outside an accumulator
528
529 - Use of aggregate functions outside accumulators
530
531 Error: This expression has type < nul : 'a; t : 'b > Sql.group Sql.unsafe
532 but an expression was expected of type
533 < nul : 'c; t : 'd; .. > Sql.t Sql.unsafe
534
535 Cause : one of the field values is an accumulator expression;
536 accumulators should be consumed by aggregate functions and cannot be
537 returned directly
538
539
540 #### Non-grouped aggregates [.non_grouped_aggregates]
541
542 Aggregates functions *cannot* be used outside group by expressions. If
543 you want to use an aggregate functions over all the rows of a table,
544 use a group by expression with an empty BY record :
545
546 `SELECT max(t.id) FROM ...` -> `group {max = max[t.id]} by {} | ...`
547
548 In that case, `by` is optional : `group {max = max[t.id]} | ...`
549
550 Rationale : aggregate functions are not regular operators; they have
551 a non-trivial semantic, wich is modeled by the typing transformations
552 inside the group .. by expressions. Using them outside those
553 expressions would not be safe and could lead to runtime errors if the
554 SQL server doesn't accept an ill-formed query. It is a design goal of
555 macaque to protect from runtime errors.
556
557
558
559 ### Description syntax [.descriptions]
560
561 Description syntax is used to describe existing database tables, making
562 them as macaque views (internally, it builds a runtime description of
563 the data-base typing informations, to be used by further macaque
564 processing). The syntax is a far relative of `SQL CREATE TABLE`
565 statements, and can be discovered in the example base.ml
566 file. Currently, no other information that name, type, nullability and
567 default value are accepted.
568
569 <:table< recipes (
570 recipe bigint NOT NULL,
571 name text,
572 amount bigint NOT NULL
573 DEFAULT($ <:value< 0L >> $),
574 category text NOT NULL
575 DEFAULT($ <:value< "unknown category" >> $)
576 ) >>
577
578 This is not a table creation/specification tool : it does not free you
579 from the need of creating your tables in the database (with
580 a potentially richer description : foreign keys, etc.). If you change
581 a table description, you will have to duplicate the changes in the
582 caml description as well.
583
584 #### Auto-incrementing columns [.incr_column]
585
586 `SERIAL` columns are not supported (yet). It is however possible to
587 emulate this feature with explicit `SEQUENCE` manipulations ([sequence
588 operators](#sequence_ops) section) :
589
590 let my_table = <:table< my_sql_table (
591 id bigint NOT NULL DEFAULT(nextval $table_id_seq$),
592 ...
593 ) >>
594
595 At insertion site, one can then use the default access syntax to get
596 the next identifier:
597
598 <:insert< my_table := { id = my_table?id; ... } >>
599
600 #### sequence descriptions [.sequence_descriptions]
601
602 It is not possible to create new sequences from Macaque (this is
603 coherent with the choice that macaques allows descriptions, not
604 declarations), but macaque support sequence descriptions. Sequence
605 creation operators are part of the `Sql.Sequence` module.
606
607 let table_id_seq = <:sequence< serial "the_sql_sequence_name" >>
608
609 #### Checking descriptions [.check_descriptions]
610
611 To help you with the macaque/database synchronization, the `Check`
612 module provide coherence check routines (`check_table`,
613 `check_sequence`). It will raise errors if the caml-side description
614 is not faithful to the real table structure (as described by the SQL
615 server table informations). It is possible to enforce an automatic
616 runtime check of every macaque-described table or sequence with the
617 '-check_tables' command line option (camlp4 time), wich can be enabled
618 by the `use_check` ocamlbuild flag (see the [OCamlbuild](#ocamlbuild)
619 section of this document).
620
621
622
623 ### OCamlbuild [.ocamlbuild]
624
625 I have created a macaque-specific OCamlbuild plugin. It was originally
626 intended to help during macaque development only, but is probably
627 a valuable ressource if you want to use Macaque inside your
628 project. Besides the classical ocamlfind integration, you'll find the
629 `Sql_syntax` modules with support for the following tags :
630
631 - `use_macaque` : enable the macaque syntax extensions
632 - `use_check` : enable the `-check_tables` flag (see the [Decription
633 syntax > Checking table descriptions](#check_descriptions)
634 subsection)
635
636 You should generally use the `macaque` tag, the three other
637 preprocessing tags being there for finer-grained control.
638
639
640
641 ### Sql functions, operators and data types [.SQL_operators]
642
643 #### Sql value operators [.value_operators]
644
645 macaque can use all function and operators defined in the `Sql.Op`
646 module, using the standard ocaml syntax (and operator associativities
647 and precedences) :
648 - `<:value< $a$ + $b$ >>` is equivalent to `Sql.Op.(+) a b`
649 - `<:value< nullable $foo$ >>` is equivalent to `Sql.Op.nullable foo`
650
651 ##### aggregate function
652
653 `Sql.Op` has some aggregate functions. See the [GROUP BY expressions
654 and aggregate functions](#group_by) section of this document for more
655 information.
656
657 #### Value types antiquotations [.value_antiquotations]
658
659 macaque supports some SQL data types, some of them having a litteral
660 syntax (ints and string : `<:value< 2 >>`, `<:value< "foo" >>`). They can
661 all be constructed by using named antiquotations :
662 <:value< $bool:true$ >>
663 <:value< $float:cos 1.2$ >>
664
665 More generally, the type constructors are the values in Sql.Value :
666 `<:value< $foo:bar$ >>` is equivalent to `Sql.Value.foo bar`
667
668 #### Column types [.column_types]
669
670 Data types used in table descriptions are defined in the
671 `Sql.Table_type` module. We use a different set of type names, in
672 order to mimic SQL type names and ease specification derivation from
673 existing SQL tables. For example, "integer" is used instead of "int",
674 and will expand to a `Sql.Table_type.integer` value.
675
676 #### Sequence description
677
678 Sequence creators are defined in the `Sql.Sequence` module.
679
680 #### Sequence operators [.sequence_ops]
681
682 Sql.Op has some sequence functions. Currently supported are
683 `currval` and `nextval`. They can be used to have an
684 auto-incrementing identifier :
685
686 <:insert< $table$ := {id = nextval $table_id_seq$; .. } >>
687
688 See the src/tests/sequence.ml for example.
689
690 #### View antiquotations [.view_antiquotations]
691
692 In view positions, macaque supports antiquotations through the
693 `Sql.View` module : `$foo:bar$` will expand to `Sql.View.foo bar`.
694
695
696
697 ### phantom types [.phantom_types]
698
699 Sql.t types sure are heavy. You *will* be confronted to cluttered
700 error messages with ugly as hell unification problems. Hopes this
701 section helps.
702
703 Sql values are packed in a phantom type providing several information
704 about the value. It is an object type with a field `t` containing type
705 information, and other fields for value information.
706
707 Type information fields :
708
709 - 'typ' : the corresponding caml type (eg. `<typ : int>`); `Sql.null` has
710 a polymorphic `t` field, as `None` for option types
711
712 val null : < t : < typ : 'a; numeric : 'b >; nul : nullable; get : unit > t
713
714 - 'numeric' : allows for numerical operators overloading (see `Sql.Op.(+)` type)
715
716 Value information fields :
717
718 - 'nul' field : nullability information, is either `Sql.nullable` or
719 `Sql.non_nullable`. Constants from the `Sql.Data` operators have
720 a polymorphic `nul` field, so that they can be used in both
721 nullability context
722
723 - 'get' : just-built sql values (<:value< 1 >>, etc.) or values
724 retrieved from a SQL query are gettable : you can ask for their
725 corresponding caml value values using the two accessors Sql.get and
726 Sql.getn (get nullable) :
727
728 val get : < t : < typ : 'a; ..>; nul : non_nullable; get : _; .. > Sql.t -> 'a
729 val getn : < t : <typ : 'a; ..>; nul : nullable; get : _; .. > Sql.t -> 'a option
730
731 On the contrary, values built from SQL operators or field access can
732 not be transformed back into caml values without being first sent to
733 the SQL server as part of a query : they don't have a 'get' field.
734
735 Now, you probably understand why the error messages tends to be
736 slightly longer than usual. And all sql values carry such information,
737 including (nested) rows...
738
739
740
741 ### Specific type fourberies [.type_subtleties]
742
743 macaque being strongly (and intricately) typed, you will often find
744 yourself confronted to hostile error messages, wich means someone has
745 done something wrong. I hope that you will more often be wrong than
746 macaque. There are nonetheless some specificites that you should be
747 aware of.
748
749
750 #### Sql interface safety [.interface_safety]
751
752 Macaque syntax extensions transform user code into complicated caml
753 code. But the produced codes still lies outside macaque module
754 boundaries, and has access to the same information that the user
755 has. Concretely, as most of the operations needed by those extensions
756 are not typable inside the OCaml type system (for example dynamic
757 construction of an object based on a list of field names and values),
758 some unsafe operations had to be exposed through Sql interface^α.
759 They are marked with the "unsafe" parametrized type (wich is only
760 there for documentation purposes) and you should NOT use them : every
761 function having an "unsafe" type somewhere in its interface is to be
762 considered forbidden. All the other functions should be type-safe,
763 otherwise it's a bug.
764
765 :Footnote:
766
767 α: and there is some Obj magic behind the scene; but it's protected
768 by typing and you won't get a segfault, I hope.
769
770 #### Update subtyping problem [.update_subtyping]
771
772 An example of update syntax is << t in $tab$ := {amount = t.amout + 1} >>,-
773 wich increment the "amout" column of all rows in table `tab`. Table
774 `tab` probably has more fields than just the "amount" colum, so the
775 type of the tuple on the right should be a subtype of `table` row
776 type.
777
778 The problem is that polymorphic subtyping quantification is not
779 available in OCaml type system : subtype relations have to be
780 explicitely constructed from the object types. This is fine when, as
781 in the given example, the right tuple object type is known as
782 camlp4-time (an object with only one field 'amout'), but not when the
783 corresponding value is an antiquotation :
784
785 let update table value predicate =
786 <:update< row in $table$ := $value row$ | $predicate row$ >>
787
788 In this case, the update syntax is used to define a generic update
789 operation^β : I know of no syntaxic way to impose that the right
790 tuple type be a subtype of the table row type.
791
792 To keep things simple, there is an ad-hoc rule for the specific update
793 problem : when tuple type is known as camlp4-time, a subtyping
794 relation is used, but when there is a quotation, the two values types
795 are unified : the set tuple has to have exactly the same columns as
796 the table, no less.
797
798 This loss of generality could surprise the innocent user. In order to
799 prevent hair scratching, a warning is emitted at camlp4-time when an
800 antiquoted caml value is used for the update tuple. This warning can
801 be disabled using the -sql-nowarn-undetermined-update command line
802 option.
803
804 Of course, it is always possible to manually expand an antiquotated
805 value, when the fields name are known at development-time :
806 <:update< t in $table$ := $tup$ | ... >>
807 should be rewritten into
808 <:update< t in $table$ := { foo = $tup$.foo; bar = $tup$.bar } | ... >>
809
810 :Footnote:
811
812 β: notice how `value` and `predicate` are actually functions
813 depending on `row` values; this kind of thing can greatly improve
814 code factorization among your SQL queries
815
816
817
818 ### Remarks [.general_remarks]
819
820 #### Side effects [.side_effects]
821
822 It is probably an obvious thing to say, but users should not put
823 expressions wich have side-effect when evaluated inside macaque
824 expressions. No guarantee is given that any part of a macaque
825 expression will be evaluated at all, or only once.
826
827 In case of doubt, you should explicitely evaluate expressions before
828 handling them to the macaque expression :
829 let my_val = my_expr in << ... $my_val$ ... >>
830
831 #### Semantic of row bindings [.row_binding_semantics]
832
833 From the [View expressions > Comprehension
834 items](#comprehension_items) part of this document :
835 %
836 > There is one important thing to know about comprehension items : row
837 > bindings are not sequential, they're simultaneous (let .. and .. and
838 > .. in ..) ! This is a not so happy feature, as the comprehension
839 > syntax strongly suggest a sequential binding : there is an impedence
840 > mismatch between the comprehension syntax and the SQL behaviour.
841
842 The rationale behind this choice is purely pragmatic : SQL queries use
843 simultaneous bindings, and reproducing that choice much facilitate the
844 translation. It would be possible to have sequential bindings, but
845 that's somewhat more complicated, and I supposed it was unnatural to
846 the SQL query writer anyway.
847
848 I'm not quite satisfacted with this state of affairs, so it is
849 possible that I replace the current behaviour with sequential
850 declarations in the future. Please do not write code that would break
851 if that was the case : `<< .. | t1 in $..$; t2 in $ .. t1 $ >>` when
852 `t1` is already in the scope. Such trickery tremendously hurts
853 readability anyway, so you should not use it even if it wasn't for
854 compatibility reasons.
855
856
857 #### Camlp4 use [.campl4_use]
858
859 I am under the impression that some "serious" ocaml users try to avoid
860 to include camlp4 in their compilation chain if possible, and to
861 minimize their reliance on camlp4 extensions. Camlp4 is
862 a not-so-simple (and really-not-so-well-documented) tool, and camlp4
863 extensions are sometimes fragile and more subject to bugs than other
864 pieces of ocaml software; syntaxic bugs can have far-reaching effects
865 (they are, however, nearly always spotted at compilation-time) and are
866 difficult to spot for the non-camlp4-aware user.
867
868 Specific care has been taken in Macaque to make the camlp4 processing
869 as solid as possible :
870
871 - camlp4 processing is split in two independent extensions
872 (`pa_macaque`, `pa_bananas`) so that the user can make fine-grained
873 decisions as to what extension to activate for a specific
874 compilation
875
876 - `pa_macaque` relies on the "quotation" mechanism of the existing
877 ocaml camlp4 grammar : they DO NOT modify the ocaml grammar but
878 instead reuse the `<:..< .. >>` mechanism wich is widely used inside
879 camlp4 itself, and thus can be reasonably trusted
880
881 - `pa_bananas` is the only extension modifying the OCaml grammar
882 itself; it is a simple extension wich affects a localized part of
883 the parsing process (the "." level of expressions : `a.b`, `a#b`..),
884 with no precedence/associativity subtleties. It is a relatively
885 trustable extension, and any spotted defects would be easy to fix.
886
887 Moreover, `pa_bananas` is only a convenience extension and it is easy
888 not to use it (use `Sql.get`/`Sql.getn` instead of `#!`/`#?`). If a problem
889 were to arise, it would be easy for a user to mechanically replace
890 all `pa_bananas` syntax uses by camlp4-free equivalent expressions,
891 and drop `pa_bananas` from her compilation chain entirely.
892
893
894 #### Macaque and PG'OCaml [.pgocaml_compat]
895
896 Macaque relies on PG'OCaml low-level interface. It is fully compatible
897 with PG'OCaml : you can use a database handler for Macaque queries and
898 PG'OCaml queries at the same time (PG'OCaml stocks some private
899 information in those handler; Macaque doesn't access nor modify them).
900
901 PG'OCaml is more mature and its general approach leads to a safer
902 software : Macaque will probably catch less errors at compile-time
903 than PG'OCaml (see the PostGreSQL typing issues of the next section
904 for an area where I believe Macaque lacks behind PG'OCaml in terms of
905 safety), and it is very young software wich probably have quite a few
906 bugs remaining.
907
908 This inconvenient comes with the flexibility Macaque brings : it is
909 composable, and does not need the SQL server available at compile time
910 (or, more exactly, camlp4-time).
911
912 I also believe that PG'OCaml prepared queries model is a bit more
913 efficient than Macaque raw-SQL-queries output. I have done absolutely
914 no work on Macaque optimisation, and my priority are rather safety and
915 composability. If you're interested in investigating Macaque
916 performances, let me know.
917
918
919 #### NULL and semantic issues [.null_inference]
920
921 I have tried to reproduce SQL behaviour as closely as possible. In
922 particular, all operators are really SQL operators, so they will have
923 the SQL behaviour, for example `null = null` is null, not false.
924
925 I have encountered issues with PostGreSQL typing system
926 though. PostGreSQL gives NULL the type 'unknown', wich is problematic
927 because sql operators are strongly typed, but the type system is not
928 very expressive. For example `NULL + 0` will work okay, but :
929
930 base=> SELECT e.n + 0 FROM (SELECT NULL AS n) AS e;
931 ERROR: failed to find conversion function from unknown to integer
932
933 I have tried to work around those issues, but think some of them are
934 probably lurking somewhere. It is unfortunate as it undermines Macaque
935 static safety : these runtime errors are subtle and quite difficult to
936 prevent. Please report them if you find some. I hope not.