Given the forum discussion data at the level of individual posts, we construct a network capturing the discussion patterns among users. The structural properties of this network form the basis of our analysis on a per-coin basis. In this network, nodes are the forum users and \textit{directed edges} point from posters within each thread to thread-initiators. The omission of edges based on simple co-appearance within a thread leads to a sparser network which isolates the communication patterns around ``dialogue-shapers''. The edges in the discussion network are weighted by the number of times a poster replies to a thread-initiator in different threads (i.e.~multiple replies by the same user within the same thread are counted only once).
% Is this too big of a statement?
In this context, edge weights capture the level of engagement thread initiators receive from the community and the amount of information a poster receives from thread initiators. Furthermore, our network construction method uses all the interactions since the inception of bitcointalk in creating new edges or updating their weights. The unlimited retention of any such (replier to thread initiator) interaction captures relevant information on seniority and community influence which are obtained through long-term and persistent presence in the forums.
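
The weighting rule above can be illustrated with a minimal Python sketch; it is not the code used in this study. It assumes posts arrive in chronological order as (thread id, author) pairs, so that the first post seen in a thread identifies its initiator. The function name \texttt{build\_edge\_weights} and the decision to ignore an initiator's replies in her own thread are assumptions made for illustration.

```python
from collections import defaultdict


def build_edge_weights(posts):
    """Return {(poster, initiator): weight} for a directed discussion network.

    posts: iterable of (thread_id, author) pairs in chronological order
    (a hypothetical layout). The weight of an edge is the number of distinct
    threads in which `poster` replied to a thread started by `initiator`.
    """
    initiator = {}              # thread_id -> user who started the thread
    counted = set()             # (thread_id, poster) pairs already counted
    weights = defaultdict(int)

    for thread_id, author in posts:
        if thread_id not in initiator:
            # First post seen in this thread: its author is the initiator.
            initiator[thread_id] = author
            continue
        if author == initiator[thread_id]:
            # Assumption: an initiator posting in her own thread adds no edge.
            continue
        if (thread_id, author) in counted:
            # Repeated replies within one thread are counted only once.
            continue
        counted.add((thread_id, author))
        weights[(author, initiator[thread_id])] += 1

    return dict(weights)


example_posts = [
    ("t1", "alice"), ("t1", "bob"), ("t1", "bob"),
    ("t2", "alice"), ("t2", "bob"), ("t2", "carol"),
]
example_weights = build_edge_weights(example_posts)
```

In this toy input, \texttt{bob} replies twice in thread \texttt{t1} and once in \texttt{t2}, so the edge from \texttt{bob} to \texttt{alice} receives weight 2 rather than 3.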
%Prior to construction of the network, we merged posts from all forums into a
%single large forum since the community base of all five forums mentioned
%above is made of the same users and we are mostly concerned about influence and
%aggregate information flow among users, rather than the exact topic of the discussion.
To build our network, we first combine the posts from all forums. We do this because the community base of all five forums mentioned above is made of the same users, and because we are mostly concerned about influence and aggregate information flow among users, rather than the exact topic of the discussion.
The network construction involves replaying all posts in chronological order and updating the
discussion network accordingly. Whenever a new altcoin is introduced
in the forum for the first time, we record the user who introduced it and take a snapshot of the network.
We analyze the discussion network only up to the time each coin is first introduced to the community, in order to avoid any possible confounding between a coin's price movement and the extra attention it receives in the community due to the same price changes. Our method uses the position of the introducer in the network snapshot and the general structure of her neighborhood to extract various network measures corresponding to that coin. Our final analysis uses these per-coin measures to evaluate the performance of each coin.
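
The replay-and-snapshot procedure can be sketched as follows. This is an illustrative Python fragment under stated assumptions, not the pipeline used here: posts are assumed to be (post id, thread id, author) triples sorted by date, and \texttt{introductions} is a hypothetical mapping from the id of an introducing post to the coin it introduces. The weighted in-degree computed at the end stands in for the network measures mentioned above, which the text does not enumerate.

```python
from collections import defaultdict


def replay_with_snapshots(posts, introductions):
    """Replay posts in order; snapshot the network at each coin introduction.

    posts: chronologically sorted (post_id, thread_id, author) triples.
    introductions: {post_id: coin} for posts that first introduce a coin
    (hypothetical input). Returns {coin: (introducer, edge_weights_copy)}.
    """
    initiator = {}
    counted = set()
    weights = defaultdict(int)
    snapshots = {}

    for post_id, thread_id, author in posts:
        if post_id in introductions:
            # Freeze the network as it stood when the coin was introduced;
            # later posts do not affect this copy.
            snapshots[introductions[post_id]] = (author, dict(weights))
        if thread_id not in initiator:
            initiator[thread_id] = author
        elif author != initiator[thread_id] and (thread_id, author) not in counted:
            counted.add((thread_id, author))
            weights[(author, initiator[thread_id])] += 1

    return snapshots


def weighted_in_degree(weights, user):
    """Sum of weights on edges pointing at `user` (one illustrative measure)."""
    return sum(w for (_, target), w in weights.items() if target == user)


example_posts = [
    (1, "t1", "alice"), (2, "t1", "bob"),
    (3, "t2", "alice"),            # alice introduces coin "XYZ" here
    (4, "t1", "carol"),            # arrives after the snapshot
]
snapshots = replay_with_snapshots(example_posts, {3: "XYZ"})
introducer, frozen = snapshots["XYZ"]
```

Because the snapshot is copied at introduction time, the later reply by \texttt{carol} does not enter the measures computed for the coin.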
%TODO Do we use the date of the ANN or the date of the first trade?!?! NIKETE & EAMAN.
% E: Both of them are in the file, depends which one you are using.
%The identification of true introductions of new altcoins is a difficult process prone to many false-positives.
The majority of such introductions are made in the \textit{Announcement} forum and are prefixed with the ``ANN'' tag. We look for the first thread whose subject contains the announcement tag together with both the coin symbol \textbf{and} its descriptive name. The first mention of either the coin symbol \textbf{or} its name is used as a fall-back in case the more restrictive \textbf{and} requirement does not detect the first introduction of the coin.
%JULIAN: The above seems like a long winded way of saying that you just used "or". i.e., the above sentence parses to "(a and b) or (a or b)". Didn't want to overwrite in case I misunderstood.
% EAMAN: remove the text on OR requirement, if you did not or do not plan on falling back on OR mentions.
Using the more restrictive match, which requires both the coin name \textbf{and} symbol to be present in the subject, we are able to detect the first introduction of 376 altcoins out of 679.
We can detect an extra 176 altcoins by falling back on the \textbf{or} requirement.
% Eaman: remove text in red if not falling back on OR.
The forum user who initiated such a thread is assigned as the introducer of the coin to the community.
Approximately 500 of the 600 coins were manually verified in the forums by two of the authors to have correct identification.
%TODO(NIKETE): add validation results table wrt mapofcoins data here