Here are four data sets (each composed of three subsets) from our paper "Boosting for Regression Transfer." Each group of three subsets, representing three related concepts, is created from a UCI data set as described in the paper:
For each data set, we identify a continuous feature that has a moderate degree of correlation (around 0.4) with the label. We then sort the instances by this feature, divide the set in thirds (low, medium, and high), and remove this feature from the resulting sets. By dividing based on a feature moderately correlated with the label, we hope to produce three data sets that represent slightly different concepts.
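
The splitting procedure quoted above can be sketched roughly as follows. This is a minimal illustration, not the code used to produce the released files: the function name `split_by_feature`, the dictionary-based row layout, and the column names in the demo are all hypothetical, and the handling of ties and of set sizes not divisible by three is an assumption, since the paper excerpt does not specify either.

```python
from typing import Dict, List, Tuple

Row = Dict[str, float]


def split_by_feature(rows: List[Row], feature: str) -> Tuple[List[Row], List[Row], List[Row]]:
    """Sort rows by `feature`, cut into low/medium/high thirds, drop `feature`.

    Assumption: cut points are placed with integer division, so any
    remainder instances end up in the medium and high subsets.
    """
    # Sort the instances by the chosen (moderately correlated) feature.
    ordered = sorted(rows, key=lambda r: r[feature])
    n = len(ordered)
    cut1, cut2 = n // 3, (2 * n) // 3

    def strip(chunk: List[Row]) -> List[Row]:
        # Remove the splitting feature from every instance in the subset.
        return [{k: v for k, v in r.items() if k != feature} for r in chunk]

    return strip(ordered[:cut1]), strip(ordered[cut1:cut2]), strip(ordered[cut2:])


# Hypothetical toy data: "s" is the splitting feature, "y" is the label.
demo_rows = [
    {"x": 1.0, "s": 5.0, "y": 10.0},
    {"x": 2.0, "s": 1.0, "y": 11.0},
    {"x": 3.0, "s": 3.0, "y": 12.0},
    {"x": 4.0, "s": 6.0, "y": 13.0},
    {"x": 5.0, "s": 2.0, "y": 14.0},
    {"x": 6.0, "s": 4.0, "y": 15.0},
]
low, medium, high = split_by_feature(demo_rows, "s")
```

In this sketch each returned subset no longer contains the splitting feature, mirroring the released data, where the feature used for the split is removed from the resulting sets.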
Data is in the WEKA ARFF format.