Add support containers in queries feature

author: Boris Kolpackov <boris@codesynthesis.com> 2014-10-27 11:44:14 +0200
committer: Boris Kolpackov <boris@codesynthesis.com> 2014-10-27 11:44:14 +0200
commit: b754d9e58997c4ad09894793d08b2d144cd598cc (patch)
tree: 5882ba8100cdcaafaf5a05b34c64a1cc6f122e2a /feature
parent: bf7bbdc018c649b04eba871b6590947c3a188c3a (diff)
2 files changed, 128 insertions, 0 deletions
diff --git a/feature/query/container b/feature/query/container
new file mode 100644
index 0000000..76c005e
--- /dev/null
+++ b/feature/query/container
@@ -0,0 +1,126 @@
+- Parts of a container query:
+
+  * selector -- selects which elements are examined
+  * predicate -- test applied to the selected elements
+  * quantifier -- counts how many selected elements satisfy the predicate
+
+  Selector and predicate use the same query syntax, e.g., (query::index < 10).
+
+  The most general quantifier is 'count' which simply returns the number
+  of elements that satisfied the predicate. We will also have "shortcut"
+  quantifiers for convenience (and optimization, in the case of 'all'):
+
+  any == (count != 0)
+  all == (count == size)
+  one == (count == 1)
+  none == (count == 0)
+
+  Note that while it may seem that selector and predicate are the same
+  thing, they really are not (see IMP operator).
+
+- The most promising syntax so far:
+
+  typedef odb::query<employer> query;
+  typedef query::employees_query emp_query; // employees_value,
+                                            // employees_element
+                                            // employees_type
+
+  query::employees[emp_query::index < 10].count (
+    emp_query::value.first == "John" &&
+    emp_query::value.last == "Doe") > 1;
+
+  The selector ([]) is optional. If not present, it defaults to 'all'.
+  Instead of 'count' one can write 'any', 'all', 'one', or 'none'.
+
+  query::employees_query type is essentially a container element (or
+  value) type. For vector it would be:
+
+  struct
+  {
+    index;
+    value;
+  };
+
+  For a map it would be:
+
+  struct
+  {
+    key;
+    value;
+  };
+
+  The weakest part in this syntax is the emp_query typedef. We kind of
+  need it in order not to have to repeat it all the time. We need to
+  come up with a clean naming scheme for these things (both for the
+  typedef inside query and the alias that the user gives it). For
+  simple queries it can be omitted, for example:
+
+  query::employees.any (
+    query::employees_element::value == name ("John", "Doe"));
+
+  It is conceptually correct that we don't say query::employees::value
+  because 'employees' is a whole container while what we refer to is
+  an element of a container.
+
+  This is also related to the mass UPDATE feature in the sense that
+  the whole "_query" naming scheme will have to be changed since we
+  will want to write something like:
+
+  update ((?::age += 1), (?::name == "John"));
+
+  Keeping the "query" name and ending up with something like this
+  is most definitely a bad idea:
+
+  update ((query::ceo = true), (query::name == "John Doe"));
+
+  So we need some neutral name, something like "members":
+
+  typedef odb::members<employee> members;
+
+  update ((members::ceo = true), (members::name == "John Doe"));
+  query (members::name == "John Doe" && members::ceo);
+
+- empty(), size() -- these are properties of the container itself, not
+  its elements. Syntax:
+
+  query::employees.empty ()
+
+  This will probably be easiest to implement with an aggregate sub-query,
+  which is ok.
+
+  There is a way to implement empty() without a subquery using left
+  join.
+
+- count predicate; e.g., more than five employees are female.
+
+- For object queries we will need DISTINCT. The container table is only used
+  in the where clause.
+
+- Joining containers in views. Here we might need DISTINCT ON (not supported
+  in SQLite), but it will probably have to be user-controllable. Also, in this
+  case the container table can be used in both the select list and where clause.
+
+- From: http://www.codesynthesis.com/pipermail/odb-users/2014-January/001696.html
+
+  When people are using a container in a query condition, we need to know
+  which elements to consider. This can be some specific element (e.g., the
+  first element), any element, all elements, a range of elements, etc.
+
+  I think the "any element" will be the most widely used case and is the one
+  we definitely have to support. For the others, I am not sure it will even be
+  possible to implement them in SQL in any sane way (e.g., all elements, a range
+  of elements). Maybe what we should do is expose the index column (or the key
+  column for maps) to the user so that they can create whatever conditions
+  they want. Something along these lines:
+
+  query::authTokens.index == 0 && query::authTokens->hash == 123
+
+  [Note the problem with this syntax: a container element may also have a data
+   member named index.]
+
+  It is also not clear how to implement the "all elements" case with this
+  approach, or in SQL in a sane/portable way in general.
+
+- User examples:
+
+  http://www.codesynthesis.com/pipermail/odb-users/2011-September/000300.html
diff --git a/feature/query/list b/feature/query/list
index 4d09312..85be57d 100644
--- a/feature/query/list
+++ b/feature/query/list
@@ -1,3 +1,5 @@
+- Support containers in queries: container
+
 - Shortcut query() call for queries that always return one element
 
   Can be useful for aggregate queries, etc.
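
Illustration (not part of the commit): the quantifier definitions in the
container notes above can be sketched against an ordinary in-memory vector.
This is only a model of the intended semantics, not ODB code. The names
element, element_test, selection, and select_elements are hypothetical, and
the sketch assumes that 'size' in "all == (count == size)" means the number
of selected elements and that "IMP" refers to logical implication.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for a vector container element: index plus value.
struct element
{
  std::size_t index;
  std::string value;
};

// Selectors and predicates share one form: a test on an element.
using element_test = std::function<bool (const element&)>;

// Hypothetical result of applying a selector to a container.
struct selection
{
  std::vector<element> selected;

  // The general quantifier: how many selected elements satisfy p.
  std::size_t count (const element_test& p) const
  {
    return static_cast<std::size_t> (
      std::count_if (selected.begin (), selected.end (), p));
  }

  // Shortcut quantifiers, defined in terms of count as in the notes.
  bool any  (const element_test& p) const {return count (p) != 0;}
  bool all  (const element_test& p) const {return count (p) == selected.size ();}
  bool one  (const element_test& p) const {return count (p) == 1;}
  bool none (const element_test& p) const {return count (p) == 0;}
};

// Keep only the elements of c that pass the selector s.
selection select_elements (const std::vector<std::string>& c,
                           const element_test& s)
{
  selection r;
  for (std::size_t i (0); i != c.size (); ++i)
  {
    element e {i, c[i]};
    if (s (e))
      r.selected.push_back (e);
  }
  return r;
}

void example ()
{
  const std::vector<std::string> names {"John", "John", "Jane"};
  element_test first_two = [] (const element& e) {return e.index < 2;};
  element_test is_john = [] (const element& e) {return e.value == "John";};

  selection s = select_elements (names, first_two);
  assert (s.count (is_john) == 2);
  assert (s.all (is_john));  // Every selected element is "John".
  assert (!s.one (is_john));

  // Without a selector, every element is examined.
  selection w = select_elements (names, [] (const element&) {return true;});

  // Folding the selector into the predicate with && changes 'all' ...
  assert (!w.all ([&] (const element& e)
                  {return first_two (e) && is_john (e);}));

  // ... while implication (selector IMP predicate) preserves it.
  assert (w.all ([&] (const element& e)
                 {return !first_two (e) || is_john (e);}));
}
```

The last two assertions show one way to read the remark that selector and
predicate are not the same thing: for 'all', merging the two with && gives a
different answer than selecting first and testing afterwards.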