Colocate Data from Different Partitioned Regions
By default, Geode allocates the data locations for a partitioned region independently of the data locations for any other partitioned region. You can change this policy for any group of partitioned regions, so that related data from different regions is hosted by the same member. Colocation is required for some operations, and it improves performance for others by reducing the number of accesses to entries that are hosted on other cluster members.
Data colocation between partitioned regions generally improves the performance of data-intensive operations. You can reduce network hops for iterative operations on related data sets. Compute-heavy applications that are data-intensive can significantly increase overall throughput. For example, a query run on a patient’s health records, insurance, and billing information is more efficient if all data is grouped in a single member. Similarly, a financial risk analytical application runs faster if all trades, risk sensitivities, and reference data associated with a single instrument are together.
Procedure
Identify one region as the central region, with which data in the other regions is explicitly colocated. If you use persistence for any of the regions, you must persist the central region.
- Create the central region before you create the others, either in the cache.xml or in your code. Regions in the XML are created before regions in the code, so if you create any of your colocated regions in the XML, you must create the central region in the XML before the others. Geode verifies that the central region exists when the others are created and throws an IllegalStateException if it does not. Do not add any colocation specifications to this central region.
- For all other regions, in the region partition attributes, provide the central region’s name in the colocated-with attribute. Use one of these methods:
XML:
<cache>
  <region name="trades">
    <region-attributes>
      <partition-attributes>
        ...
      </partition-attributes>
    </region-attributes>
  </region>
  <region name="trade_history">
    <region-attributes>
      <partition-attributes colocated-with="trades">
        ...
      </partition-attributes>
    </region-attributes>
  </region>
</cache>
Java:
// Central region: no colocation setting.
PartitionAttributes attrs = ...
Region trades = new RegionFactory().setPartitionAttributes(attrs)
    .create("trades");
...
// Colocated region: name the central region by its full path.
attrs = new PartitionAttributesFactory().setColocatedWith(trades.getFullPath())
    .create();
Region trade_history = new RegionFactory().setPartitionAttributes(attrs)
    .create("trade_history");
gfsh:
gfsh>create region --name="trades" --type=PARTITION
gfsh>create region --name="trade_history" --type=PARTITION --colocated-with="trades"
For each of the colocated regions, use the same values for these partition attributes related to bucket management:
- recovery-delay
- redundant-copies
- startup-recovery-delay
- total-num-buckets
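To see why total-num-buckets in particular has to match, consider the following sketch. It is a simplified, standard-library-only model that assumes an entry's bucket is chosen by hashing its routing object into the configured number of buckets; the class `BucketModel` and the helper `bucketFor` are hypothetical names used for illustration, and Geode's internal implementation may differ.

```java
// Simplified model of bucket assignment; an assumption for illustration,
// not Geode's actual implementation.
final class BucketModel {

    // Hypothetical helper: hash the routing object into one of totalNumBuckets buckets.
    static int bucketFor(Object routingObject, int totalNumBuckets) {
        return Math.floorMod(routingObject.hashCode(), totalNumBuckets);
    }

    public static void main(String[] args) {
        String instrument = "IBM";

        // Same total-num-buckets in both regions: the bucket ids agree.
        int tradesBucket = bucketFor(instrument, 113);
        int historyBucket = bucketFor(instrument, 113);
        System.out.println("matching counts agree: " + (tradesBucket == historyBucket));

        // Different total-num-buckets: the ids can diverge for the same instrument.
        int mismatched = bucketFor(instrument, 57);
        System.out.println("mismatched counts agree: " + (tradesBucket == mismatched));
    }
}
```

In this model, related entries can only be guaranteed to land in the same bucket when every colocated region uses the same bucket count.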
If you custom partition your region data, specify the custom resolver for all colocated regions. This example uses the same partition resolver for both regions:
XML:
<cache>
  <region name="trades">
    <region-attributes>
      <partition-attributes>
        <partition-resolver name="TradesPartitionResolver">
          <class-name>myPackage.TradesPartitionResolver</class-name>
        </partition-resolver>
      </partition-attributes>
    </region-attributes>
  </region>
  <region name="trade_history">
    <region-attributes>
      <partition-attributes colocated-with="trades">
        <partition-resolver name="TradesPartitionResolver">
          <class-name>myPackage.TradesPartitionResolver</class-name>
        </partition-resolver>
      </partition-attributes>
    </region-attributes>
  </region>
</cache>
Java:
PartitionResolver resolver = new TradesPartitionResolver();
PartitionAttributes attrs = new PartitionAttributesFactory()
    .setPartitionResolver(resolver).create();
Region trades = new RegionFactory().setPartitionAttributes(attrs)
    .create("trades");
attrs = new PartitionAttributesFactory()
    .setColocatedWith(trades.getFullPath()).setPartitionResolver(resolver)
    .create();
Region trade_history = new RegionFactory().setPartitionAttributes(attrs)
    .create("trade_history");
gfsh:
Specify a partition resolver as described in the configuration section of Custom-Partition Your Region Data.
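As a rough illustration of why all colocated regions need the same resolver, the sketch below models a resolver as a function from an entry key to a routing object. It uses only the Java standard library and does not call the Geode API. The key format `<instrument>:<id>`, the class `ResolverModel`, and its helpers are assumptions made for this example, as is the hash-based bucket choice.

```java
import java.util.function.Function;

// Simplified model of custom partitioning; an assumption for illustration,
// not Geode's PartitionResolver interface.
final class ResolverModel {

    // Hypothetical resolver: for keys shaped "<instrument>:<id>", route by instrument.
    static final Function<String, String> BY_INSTRUMENT = key -> {
        int idx = key.indexOf(':');
        return idx < 0 ? key : key.substring(0, idx);
    };

    // Hypothetical helper: pick a bucket from the routing object, not the full key.
    static int bucketFor(String key, Function<String, String> resolver, int totalNumBuckets) {
        return Math.floorMod(resolver.apply(key).hashCode(), totalNumBuckets);
    }

    public static void main(String[] args) {
        // Keys differ between the two regions, but both resolve to "IBM".
        int tradesBucket = bucketFor("IBM:trade-1", BY_INSTRUMENT, 113);
        int historyBucket = bucketFor("IBM:history-42", BY_INSTRUMENT, 113);
        System.out.println("same bucket: " + (tradesBucket == historyBucket));
    }
}
```

Because both regions derive the same routing object from their keys, a trade and its history entries map to the same bucket id in this model. If one region used a different resolver, the routing objects, and therefore the buckets, could differ.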
If you want to persist data in the colocated regions, persist the central region and then persist the other regions as needed. Use the same disk store for all of the colocated regions that you persist.