4.2.6.10.1. Queries with distinct

The following may happen when distinct operator is omitted in JPQL queries for screens containing lists of entities with enabled paging and in scenario where an unpredictable modification of the query can happen as result of applying a generic filter or access group constraints mechanisms:

  • If a collection is joined at database level, the dataset will contain duplicate rows.

  • On client level the duplicates disappear in the datasource as they are added to a map (java.util.Map).

  • In case of paged table, a page may show fewer lines than requested, while the total number of lines exceeds requested.

Thus, we recommend including distinct in JPQL queries, which ensures the absence of duplicates in the dataset returned from the DB. However, certain DB servers (PostgreSQL in particular) may take unacceptably long time to execute an SQL query with distinct, if the number of returned records is big (more than 10000).

To solve this, the platform contains a mechanism to operate correctly without distinct at SQL level. This mechanism is enabled by cuba.inMemoryDistinct application property. When activated, it does the following:

  • The JPQL query should still include select distinct.

  • DataManager cuts distinct out of the JPQL query before sending it to ORM.

  • After the data page is loaded by DataManager, it deletes the duplicates and runs additional queries to DB in order to retrieve the necessary number of rows which are then returned to the client.