Internet advertisements
2017
30/11
 
  Participants: 45  Submissions: 310
 

This dataset represents a set of possible advertisements on Internet pages. The features encode the geometry of the image (if available) as well as phrases occurring in the URL, the image’s URL and alt text, the anchor text, and words occurring near the anchor text. The task is to predict whether an image is an advertisement (“ad”) or not (“nonad”).

(See also ad.DOCUMENTATION file)

1. Original data: http://archive.ics.uci.edu/ml/datasets/Internet+Advertisements

2. Sources:
(a) Creator & donor: Nicholas Kushmerick <nick@ucd.ie>
(c) Generated: April-July 1998

3. Past Usage:
N. Kushmerick (1999). “Learning to remove Internet advertisements”,
3rd Int Conf Autonomous Agents. Available at
www.cs.ucd.ie/staff/nick/research/download/kushmerick-aa99.ps.gz.
Accuracy >97% using C4.5rules in predicting whether an image is an
advertisement.

4. This dataset represents a set of possible advertisements on
Internet pages. The features encode the geometry of the image (if
available) as well as phrases occurring in the URL, the image’s URL and
alt text, the anchor text, and words occurring near the anchor text.
The task is to predict whether an image is an advertisement (“ad”) or
not (“nonad”).

Submissions are scored by the proportion of correct predictions.
The partial score is based on 200 observations of the test set (you are not told which ones);
the final score is based on the remaining 800 observations.
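
As a rough illustration of the scoring rule, the proportion of correct predictions can be computed as below. This is only a sketch: the function name `accuracy` and the toy label vectors are assumptions made for the example, not part of the competition code. Note that with 200 observations in the partial score, each observation accounts for 0.5 percentage points.

```r
# Proportion of correct predictions (accuracy).
# `truth` and `pred` are hypothetical vectors of class labels of equal length.
accuracy <- function(truth, pred) {
  stopifnot(length(truth) == length(pred))
  mean(as.character(truth) == as.character(pred))
}

# Toy example with made-up labels: 4 of the 5 predictions match.
truth <- c("ad", "nonad", "nonad", "ad", "nonad")
pred  <- c("ad", "nonad", "ad",    "ad", "nonad")
accuracy(truth, pred)  # 0.8

# Weight of a single observation in a 200-observation score.
1 / 200  # 0.005
```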

The competition closes at 23:59 on 10 June.

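# load the training data (provides the data frame `train`)
load("train.Rdata")

# fit a classification tree with `ad` as the response
library(rpart)
mod <- rpart(ad ~ ., data = train, method = "class")

# confusion table on the training data (apparent errors)
table(train$ad, predict(mod, type = "class"))

# predict the responses for the test set
load("test.Rdata")
yhat <- predict(mod, newdata = test, type = "class")

# file to use in the submission
write.table(file = "mySubmission.txt", yhat)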
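
The confusion table in the script above is computed on the training data, so the error it shows is an apparent (optimistic) one. One possible way to get a more honest estimate before submitting is to hold out part of the labelled data. The sketch below is an assumption-laden illustration: `holdout_accuracy`, `frac`, `seed`, and the synthetic data frame `toy` are hypothetical names introduced here so that the code runs without `train.Rdata`.

```r
library(rpart)

# Hold out a random fraction of the labelled data to estimate accuracy.
# `data` must contain a factor column named `ad`.
holdout_accuracy <- function(data, frac = 0.25, seed = 1) {
  set.seed(seed)
  idx  <- sample(nrow(data), size = floor(frac * nrow(data)))
  fit  <- rpart(ad ~ ., data = data[-idx, ], method = "class")
  pred <- predict(fit, newdata = data[idx, ], type = "class")
  mean(as.character(pred) == as.character(data$ad[idx]))
}

# Synthetic stand-in for `train` (made-up values, not the real data).
set.seed(42)
n <- 400
toy <- data.frame(height = runif(n, 10, 200),
                  width  = runif(n, 10, 600),
                  flag   = rbinom(n, 1, 0.3))
toy$ad <- factor(ifelse(toy$width > 400 & toy$height < 100, "ad", "nonad"))

holdout_accuracy(toy)
```

With the real data, the call would presumably be `holdout_accuracy(train)` after `load("train.Rdata")`.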


See also the text file ad.names for details on the attribute names, and ad.DOCUMENTATION for a better-formatted display.


5. Number of Instances: 2279 (1957 nonads, coded 0; 322 ads, coded 1)

6. Number of Attributes: 1558 (3 continuous; others binary; this is the
“STANDARD encoding” mentioned in [Kushmerick, 99].)
One or more of the three continuous features are missing in 28%
of the instances; missing values should be interpreted as “unknown”.
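
A small sketch of how the missing continuous features might be inspected and treated. It assumes that missing values are stored as `NA` in the data frame, which the documentation does not state; the toy values and the `_missing` indicator columns are made up for the example. `rpart` can also handle missing predictor values itself through surrogate splits, so imputation is only one option.

```r
# Toy data frame standing in for `train`; the values are made up.
toy <- data.frame(
  height = c(60, NA, 90, 125, NA),
  width  = c(468, NA, 90, 125, 234),
  aratio = c(7.8, NA, 1, 1, NA),
  local  = c(1, 0, 0, 1, 1)
)
cont <- c("height", "width", "aratio")

# Share of rows where at least one continuous feature is missing.
mean(!complete.cases(toy[, cont]))  # 0.4

# One possible treatment: keep an "unknown" indicator, then fill the
# gap with the median of the observed values.
for (v in cont) {
  toy[[paste0(v, "_missing")]] <- as.integer(is.na(toy[[v]]))
  toy[[v]][is.na(toy[[v]])] <- median(toy[[v]], na.rm = TRUE)
}
```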

7. See [Kushmerick, 99] for details of the attributes; in
“.names” format:

height: continuous. | possibly missing
width: continuous. | possibly missing
aratio: continuous. | possibly missing
local: 0,1.
| 457 features from url terms, each of the form “url*term1+term2…”;
| for example:
url*images+buttons: 0,1.
…
| 495 features from origurl terms, in same form; for example:
origurl*labyrinth: 0,1.
…
| 472 features from ancurl terms, in same form; for example:
ancurl*search+direct: 0,1.
…
| 111 features from alt terms, in same form; for example:
alt*your: 0,1.
…
| 19 features from caption terms
caption*and: 0,1.
…
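
The naming scheme above makes it easy to group attributes by their source. The sketch below works on a handful of names copied from the examples in the documentation; how the columns are actually named inside `train.Rdata` is not stated and may differ (R can alter characters such as `*` and `+` in column names), so treat the vector `attr_names` as an assumption.

```r
# A handful of attribute names in the ".names" style; the real file
# lists 1558 of them.
attr_names <- c("height", "width", "aratio", "local",
                "url*images+buttons", "origurl*labyrinth",
                "ancurl*search+direct", "alt*your", "caption*and")

# Everything before the first "*" is the group; names without "*" are
# the three geometry features and the `local` flag.
group <- ifelse(grepl("*", attr_names, fixed = TRUE),
                sub("\\*.*$", "", attr_names),
                "geometry/local")
table(group)

# The counts stated in the documentation add up to the attribute total.
sum(c(3, 1, 457, 495, 472, 111, 19))  # 1558
```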

8. Missing Attribute Values: yes

9. Class Distribution: number of instances per class
1957 nonads, 322 ads.
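
Because the classes are unbalanced, it can help to know what accuracy a trivial classifier would reach on data with this distribution. The arithmetic below uses only the counts given above; the variable names are illustrative.

```r
n_nonad <- 1957
n_ad    <- 322
n_total <- n_nonad + n_ad   # 2279

# Accuracy of always predicting the majority class ("nonad").
baseline <- n_nonad / n_total
round(baseline, 4)  # 0.8587
```

Any model worth submitting should beat this figure on held-out data, provided the test set has a similar class balance (which is not stated).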




test set: test.Rdata (50 KB)
train: train.Rdata (90 KB)
ad.names: ad.names (30 KB)
ad.documentation: ad.DOCUMENTATION (2 KB)
# | Name | Score | Attempts | Last attempt
1 | Cristian Castiglione (I TRE MOSCHETTIERI) | PARTIAL 97.00% | 1 | 26.04.2016 13:07
2 | Jacopo Rossini | PARTIAL 96.50% | 22 | 09.06.2015 22:03
3 | m.malacarne2 | PARTIAL 96.50% | 19 | 26.04.2016 15:53
4 | AVON VALENTINO | PARTIAL 96.50% | 15 | 26.04.2016 13:01
5 | marco.petretta (I TRE MOSCHETTIERI) | PARTIAL 96.50% | 10 | 25.04.2016 08:40
6 | DavidePoggi(I TRE MOSCHETTIERI) | PARTIAL 96.50% | 4 | 24.04.2016 20:15
7 | s.terragni3 | PARTIAL 96.50% | 2 | 26.04.2016 07:21
8 | giacomo.ceoldo | PARTIAL 96.00% | 19 | 05.06.2015 15:12
9 | g.vacca | PARTIAL 96.00% | 12 | 30.08.2015 16:44
10 | francesco.bizzotto.3 | PARTIAL 96.00% | 11 | 03.06.2015 19:19
11 | e.furfaro1 | PARTIAL 96.00% | 9 | 28.07.2015 16:11
12 | martina.dossi93 | PARTIAL 96.00% | 7 | 09.06.2015 14:52
13 | e.bertone1 | PARTIAL 96.00% | 1 | 23.04.2016 16:35
14 | christian.colombo | PARTIAL 95.00% | 32 | 08.07.2015 10:59
15 | igor.artico | PARTIAL 95.00% | 17 | 10.06.2015 22:28
16 | edoardo.vignotto | PARTIAL 94.50% | 41 | 07.06.2015 13:20
17 | rebecchi_n | PARTIAL 94.50% | 11 | 24.04.2016 20:35
18 | sonubi03 | PARTIAL 94.50% | 8 | 29.08.2015 11:50
19 | Pugi Jacopo | PARTIAL 94.50% | 7 | 27.04.2016 16:14
20 | livio.finos | PARTIAL 94.50% | 4 | 06.06.2015 22:44
21 | cuge89 | PARTIAL 94.50% | 2 | 08.06.2015 09:34
22 | davide.meneghetti.1 | PARTIAL 94.50% | 2 | 06.06.2015 11:48
23 | emanuele.degani | PARTIAL 94.50% | 2 | 31.05.2015 17:40
24 | boyuan.zhang | PARTIAL 94.50% | 1 | 23.04.2016 15:29
25 | davide.cecchinato.3 | PARTIAL 94.50% | 1 | 26.04.2016 23:13
26 | davide.comerlati | PARTIAL 94.50% | 1 | 26.04.2016 08:57
27 | BOSCHETTO DAVIDE | PARTIAL 94.50% | 1 | 19.04.2016 23:07
28 | filippo.scarpa.1 | PARTIAL 94.50% | 1 | 26.04.2016 09:08
29 | gabriella.dipede | PARTIAL 94.00% | 5 | 03.06.2015 23:03
30 | simone | PARTIAL 94.00% | 2 | 25.07.2015 13:31
31 | adc | PARTIAL 93.50% | 6 | 26.04.2016 12:00
32 | damiano.costa | PARTIAL 93.50% | 2 | 26.04.2016 11:59
33 | bergamin | PARTIAL 93.50% | 1 | 26.04.2016 11:51
34 | federicogarbin91 | PARTIAL 93.00% | 3 | 23.04.2016 16:53
35 | solari.aldo | PARTIAL 80.50% | 19 | 27.04.2016 21:50