FAQ

Several people have asked about the statement that some annotations "could be in either class of [a] context", so we thought it would help to explain what this means.

First of all, it's important to understand how the data was collected. Subjects wore the armband while they were going about their daily lives. They recorded timestamps when they were doing certain activities, ideally near the beginning and end of the activity (but certainly within the activity). There were many different annotations, at several different levels. For example, someone's annotated day might look like:

7:30am armband goes on after shower
7:45am subject gets into car
7:48am timestamp for start of driving
8:15am timestamp for end of driving
8:19am subject sits down at desk
8:24am subject timestamps start of computer_work
9:14am subject timestamps end of computer_work, goes to bathroom
9:20am subject forgets to timestamp, but knows they were at work all day, so annotates this as the start of office_work
5:30pm subject timestamps end of office_work, forgets to timestamp the drive home, so doesn't annotate it.
6:24pm subject timestamps start of exercise_stationary_bike
6:44pm subject timestamps end of exercise_stationary_bike
6:50pm subject timestamps beginning of general_exercise (tae kwon do class, which isn't a possible annotation)
7:10pm subject timestamps end of general_exercise
9:25pm subject timestamps beginning of watching_tv
11:30pm subject timestamps end of watching_tv
12:01am subject timestamps beginning of lying_down
7:30am subject timestamps end of lying_down and removes armband

Overall, this 24-hour period yields 1440 minutes of data, each with some annotation. Most minutes will be annotated as unlabeled. The others will be annotated as one of {driving, computer_work, office_work, exercise_stationary_bike, general_exercise, watching_tv, lying_down}.

Now let's look at two possible target classes (which are NOT the target classes in the contest, by the way).  Let's say that the two target classes were computer work and stationary biking.  
For stationary bike, the exercise_stationary_bike annotation clearly counts as a positive example. For general_exercise, we can't know (someone could annotate their entire trip to a gym as general_exercise, which might include some time on the stationary bike), so it must count as unlabeled data for this class. Everything else can count as negative examples.
For computer work, clearly the computer_work annotation counts as a positive example.  Office_work, however, overlaps with computer_work and isn't necessarily a positive or negative example, so must count as unlabeled.  The other annotations can be counted as negative examples.   So, from the web page:

     Positive examples of context 1: 3004
     Could be in either class of context 1: 0, 3003, 5199, 5101
     Negative examples of context 1: All other annotations.
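
The listing above can be read as a three-way lookup on the annotation code. Below is a minimal Python sketch of that reading. The function name role_for_context_1 and the role strings are our own illustrative choices, not part of the contest materials. Only the numeric codes come from the listing.

```python
# Codes taken from the context 1 listing; everything else is illustrative.
POSITIVE_CODES = {3004}
EITHER_CLASS_CODES = {0, 3003, 5199, 5101}


def role_for_context_1(annotation: int) -> str:
    """Return how an annotation code is treated for context 1."""
    if annotation in POSITIVE_CODES:
        return "positive"
    if annotation in EITHER_CLASS_CODES:
        # Could be in either class, so it is neither positive nor negative.
        return "either"
    # All other annotations are negative examples.
    return "negative"


for code in (3004, 3003, 0, 13):
    print(code, role_for_context_1(code))
```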

Note that the instances with labels 0, 3003, 5199, and 5101 will appear in both the training AND test data.  The reason is to allow people to model the task as sequential.  For example, you (or your learner) may notice in the training data that label 3004 always appears a minute after label 3003.  So correctly identifying the instances with label 3003 might be useful.
For the purposes of scoring, though, the predictions for instances labeled 0, 3003, 5199, and 5101 will be ignored (though note that the learner will not know which instances those are).  The final score will be the overall accuracy over the positive and negative examples.
For example:
    instance   true label   prediction
    1          3004         + (positive)
    2          3003         +
    3          0            - (negative)
    4          13           +
    5          3004         -

Here the learner would score 1/3.  Instance 1 is correct, instances 4 and 5 are incorrect, and instances 2 and 3 are ignored.
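
The scoring rule can be sketched in a few lines of Python. This is a minimal illustration under the context 1 definitions, not the official scoring script. The names score, POSITIVE_CODES and IGNORED_CODES are assumptions made for the example, and predictions are represented as booleans (True for +, False for -).

```python
from typing import Iterable, Tuple

# Codes from the context 1 listing; the names are illustrative.
POSITIVE_CODES = {3004}
IGNORED_CODES = {0, 3003, 5199, 5101}


def score(rows: Iterable[Tuple[int, bool]]) -> Tuple[int, int]:
    """Return (correct, scored) over the positive and negative instances.

    Each row is (true_label, predicted_positive). Instances whose true
    label is in IGNORED_CODES do not count toward either number.
    """
    correct = 0
    scored = 0
    for true_label, predicted_positive in rows:
        if true_label in IGNORED_CODES:
            continue  # prediction is ignored for scoring
        scored += 1
        is_positive = true_label in POSITIVE_CODES
        if predicted_positive == is_positive:
            correct += 1
    return correct, scored


# The five instances from the example, in order.
example = [(3004, True), (3003, True), (0, False), (13, True), (3004, False)]
correct, scored = score(example)
print(f"{correct}/{scored}")  # prints 1/3
```

Returning the two counts rather than a single float keeps the result easy to compare with the fraction quoted in the text.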
Hopefully this clarified the situation for some people.

Cheers,

- The PDMC organizers