Internet advertisements
Display advertising
2017
30/11
 
Participants: 45   Submissions: 310
 

This dataset represents a set of possible advertisements on Internet pages. The features encode the geometry of the image (if available) as well as phrases occurring in the URL, the image’s URL and alt text, the anchor text, and words occurring near the anchor text. The task is to predict whether an image is an advertisement (“ad”) or not (“nonad”).

(See also ad.DOCUMENTATION file)

1. original data: http://archive.ics.uci.edu/ml/datasets/Internet+Advertisements

2. Sources:
(a) Creator & donor: Nicholas Kushmerick <nick@ucd.ie>
(c) Generated: April-July 1998

3. Past Usage:
N. Kushmerick (1999). “Learning to remove Internet advertisements”,
3rd Int Conf Autonomous Agents. Available at
www.cs.ucd.ie/staff/nick/research/download/kushmerick-aa99.ps.gz.
Accuracy >97% using C4.5rules in predicting whether an image is an
advertisement.

4. This dataset represents a set of possible advertisements on
Internet pages. The features encode the geometry of the image (if
available) as well as phrases occurring in the URL, the image’s URL and
alt text, the anchor text, and words occurring near the anchor text.
The task is to predict whether an image is an advertisement (“ad”) or
not (“nonad”).

Evaluation: proportion of correct predictions.
The partial score is based on 200 observations of the test set (which ones is not disclosed to you);
the final score is based on the remaining 800 observations.

The competition closes at 23:59 on June 10.
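The score described above is plain classification accuracy. A minimal sketch in R, assuming the true and predicted labels are available as two vectors of equal length; the helper name `accuracy` and the toy labels are illustrative and not part of the competition files:

```r
# Proportion of correct predictions (hypothetical helper, not from the competition files)
accuracy <- function(y, yhat) {
  stopifnot(length(y) == length(yhat))
  # Compare as character so factors with different level sets do not raise an error
  mean(as.character(y) == as.character(yhat))
}

# Toy labels for illustration only
y    <- c("ad", "nonad", "nonad", "ad", "nonad")
yhat <- c("ad", "nonad", "ad",    "ad", "nonad")

acc <- accuracy(y, yhat)
print(acc)  # 4 of the 5 labels match
```

Since the true test labels are not distributed, a function like this is only useful on data you hold back from your own training set.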

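# load the training data (provides the data frame `train`)
load("train.Rdata")

library(rpart)
# fit a classification tree for the response `ad` on all other columns
mod <- rpart(ad ~ ., data = train, method = "class")

# confusion table (apparent errors, computed on the training data)
table(train$ad, predict(mod, type = "class"))

# predict the responses for the test set
load("test.Rdata")
yhat <- predict(mod, newdata = test, type = "class")

# file to use for the submission
write.table(file = "mySubmission.txt", yhat)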
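The confusion table in the starter script is computed on the training data, so it shows apparent errors, which are usually optimistic. One common way to get a more honest estimate is to hold out part of the training rows. The sketch below uses a small synthetic data frame (`toy`) as a stand-in for `train`, because the real file is not reproduced here; the column names, the 75/25 split, and the rule that generates the labels are all assumptions made for illustration.

```r
library(rpart)

set.seed(1)
# Synthetic stand-in for `train` (assumption: a data frame with a factor response `ad`)
n <- 400
toy <- data.frame(
  height = runif(n, 10, 300),
  width  = runif(n, 10, 600),
  local  = rbinom(n, 1, 0.5)
)
# Made-up labelling rule, for illustration only
toy$ad <- factor(ifelse(toy$width > 400 & toy$height < 100, "ad", "nonad"))

# Keep 75% of the rows for fitting, hold out the rest
idx  <- sample(n, size = round(0.75 * n))
fit  <- rpart(ad ~ ., data = toy[idx, ], method = "class")
pred <- predict(fit, newdata = toy[-idx, ], type = "class")

# Confusion table and accuracy on rows the tree has not seen
conf_tab    <- table(observed = toy$ad[-idx], predicted = pred)
holdout_acc <- mean(as.character(pred) == as.character(toy$ad[-idx]))
print(conf_tab)
print(holdout_acc)
```

With the real data, the same pattern would be applied to `train` in place of `toy`; the final model can then be refitted on all training rows before predicting `test`.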


See also the text file ad.names for details of the attribute names, and ad.DOCUMENTATION for a better-formatted display.


5. Number of Instances: 2279 (1957 0:nonads, 322 1:ads)

6. Number of Attributes: 1558 (3 continuous; others binary; this is the
“STANDARD encoding” mentioned in [Kushmerick, 99].)
One or more of the three continuous features are missing in 28%
of the instances; missing values should be interpreted as “unknown”.
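A quick way to check the share of instances with missing geometry is to count the rows where at least one of the three continuous features is absent. The sketch below assumes missing values are stored as `NA` once the data is in R, and it uses a five-row toy data frame rather than the real `train` object; the values are invented.

```r
# Toy data frame for illustration; the values are made up
toy <- data.frame(
  height = c(60, NA, 90, 125, NA),
  width  = c(468, NA, 120, 125, 88),
  aratio = c(7.8, NA, 1.33, 1, NA),
  local  = c(1, 0, 1, 1, 0)
)

cont <- c("height", "width", "aratio")

# TRUE for rows where at least one continuous feature is missing
any_missing   <- !complete.cases(toy[, cont])
share_missing <- mean(any_missing)
print(share_missing)  # rows 2 and 5 have missing values, so 2 of 5
```

Note that `rpart` does not drop rows merely because some predictors are missing: by default it keeps them and uses surrogate splits, so explicit imputation is optional for the tree in the starter script.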

7. See [Kushmerick, 99] for details of the attributes; in
“.names” format:

height: continuous. | possibly missing
width: continuous. | possibly missing
aratio: continuous. | possibly missing
local: 0,1.
| 457 features from url terms, each of the form “url*term1+term2…”;
| for example:
url*images+buttons: 0,1.
…
| 495 features from origurl terms, in same form; for example:
origurl*labyrinth: 0,1.
…
| 472 features from ancurl terms, in same form; for example:
ancurl*search+direct: 0,1.
…
| 111 features from alt terms, in same form; for example:
alt*your: 0,1.
…
| 19 features from caption terms
caption*and: 0,1.
…
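The naming scheme in item 7 can be unpacked mechanically: the part before `*` names the source of the phrase, and the part after it is a `+`-separated list of terms. The sketch below is illustrative only; `parse_feature` is a hypothetical helper, and it assumes the names still look as in the “.names” listing (column names in the loaded R data frame may have been altered on import). It also checks that the group sizes listed above add up to the stated 1558 attributes.

```r
# Split a feature name such as "url*images+buttons" into its group and terms
parse_feature <- function(name) {
  parts <- strsplit(name, "*", fixed = TRUE)[[1]]
  if (length(parts) < 2) {
    # Names without "*" (height, width, aratio, local) carry no terms
    return(list(group = name, terms = character(0)))
  }
  list(group = parts[1],
       terms = strsplit(parts[2], "+", fixed = TRUE)[[1]])
}

examples <- c("height", "url*images+buttons",
              "origurl*labyrinth", "ancurl*search+direct")
parsed <- lapply(examples, parse_feature)
str(parsed)

# Group sizes as listed in item 7
counts <- c(continuous = 3, local = 1, url = 457, origurl = 495,
            ancurl = 472, alt = 111, caption = 19)
total <- sum(counts)
print(total)  # 1558
```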

8. Missing Attribute Values: yes

9. Class Distribution: number of instances per class
1957 nonads, 322 ads.
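Because the classes are unbalanced, accuracy figures are easier to read against a trivial baseline. The short calculation below uses only the counts given in item 9 and shows what a classifier that always answers “nonad” would score on these instances; the variable names are illustrative.

```r
n_nonad <- 1957
n_ad    <- 322
n_total <- n_nonad + n_ad   # 2279 instances

# Accuracy of a classifier that always predicts the majority class
baseline_acc <- max(n_nonad, n_ad) / n_total
print(round(baseline_acc, 4))  # 0.8587
```

Any model worth submitting should therefore score clearly above roughly 86% on data with a similar class balance.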




Files:
test set: test.Rdata (50 KB)
train: train.Rdata (90 KB)
ad.names: ad.names (30 KB)
ad.documentation: ad.DOCUMENTATION (2 KB)
#   Name                                       Score         Attempts  Last attempt
1   Marco Petretta                             FINAL 97.75%  10        25.04.2016 08:40
2   e.bertone1                                 FINAL 97.62%  1         23.04.2016 16:35
3   Davide Boschetto                           FINAL 97.62%  1         19.04.2016 23:07
4   m.malacarne2                               FINAL 97.50%  19        26.04.2016 15:53
5   s.terragni3                                FINAL 97.50%  2         26.04.2016 07:21
6   g.vacca                                    FINAL 97.50%  12        30.08.2015 16:44
7   e.furfaro1                                 FINAL 97.50%  9         28.07.2015 16:11
8   Jacopo Rossini                             FINAL 97.25%  22        09.06.2015 22:03
9   AVON VALENTINO                             FINAL 97.12%  15        26.04.2016 13:01
10  martina.dossi93                            FINAL 96.75%  7         09.06.2015 14:52
11  DavidePoggi(I TRE MOSCHETTIERI)            FINAL 96.62%  4         24.04.2016 20:15
12  francesco.bizzotto.3                       FINAL 96.62%  11        03.06.2015 19:19
13  Pugi Jacopo                                FINAL 96.50%  7         27.04.2016 16:14
14  davide.comerlati                           FINAL 96.50%  1         26.04.2016 08:57
15  filippo.scarpa.1                           FINAL 96.50%  1         26.04.2016 09:08
16  edoardo.vignotto                           FINAL 96.38%  41        07.06.2015 13:20
17  rebecchi_n                                 FINAL 96.38%  11        24.04.2016 20:35
18  livio.finos                                FINAL 96.38%  4         06.06.2015 22:44
19  cuge89                                     FINAL 96.38%  2         08.06.2015 09:34
20  davide.meneghetti.1                        FINAL 96.38%  2         06.06.2015 11:48
21  emanuele.degani                            FINAL 96.38%  2         31.05.2015 17:40
22  boyuan.zhang                               FINAL 96.38%  1         23.04.2016 15:29
23  davide.cecchinato.3                        FINAL 96.38%  1         26.04.2016 23:13
24  christian.colombo                          FINAL 96.25%  32        08.07.2015 10:59
25  Cristian Castiglione (I TRE MOSCHETTIERI)  FINAL 96.00%  1         26.04.2016 13:07
26  giacomo.ceoldo                             FINAL 96.00%  19        05.06.2015 15:12
27  igor.artico                                FINAL 95.88%  17        10.06.2015 22:28
28  adc                                        FINAL 95.88%  6         26.04.2016 12:00
29  damiano.costa                              FINAL 95.88%  2         26.04.2016 11:59
30  bergamin                                   FINAL 95.88%  1         26.04.2016 11:51
31  gabriella.dipede                           FINAL 95.75%  5         03.06.2015 23:03
32  federicogarbin91                           FINAL 95.38%  3         23.04.2016 16:53
33  simone                                     FINAL 95.25%  2         25.07.2015 13:31
34  sonubi03                                   FINAL 95.12%  8         29.08.2015 11:50
35  solari.aldo                                FINAL 87.75%  19        27.04.2016 21:50