In this section, we discuss the metrics extracted from discussion networks and used as independent variables in the regression analysis. Many of these variables are standard metrics in graph theory designed to capture node centrality in specific scenarios \cite{KleinbergNetworks}. As mentioned before, each coin is associated with a forum user and a discussion network, which corresponds to the state of the forum at the time the user introduced the coin to the community. All of our node-level variables refer to the user introducing the coin. Below, we list the network variables included in the analysis. We used the Python igraph implementation to compute the network-related metrics \cite{igraph}.
\begin{enumerate}[topsep=0pt,itemsep=-0.5ex,partopsep=1ex,parsep=1ex]
\item \textbf{Introducer number of posts:} The total number of posts (thread-initiations or simple replies) the coin introducer has made at the time she introduces the coin. It captures the user's level of activity in the community.
\item \textbf{Introducer number of threads:} The total number of threads the coin introducer has made. Users who start many threads are more likely to receive incoming edges and to shape the dialogue in the community.
\item \textbf{Seniority:} The number of days since the user's first post in the forums. We use this as a proxy for the user's seniority in the community.
\item \textbf{Incoming degree:} The (incoming) degree centrality captures the role of dialogue-shapers in the community as it is the number of unique users who have replied to any of the focal user's threads.
\item \textbf{Outgoing degree:} The (outgoing) degree centrality captures the role of followers in the community as it is the number of unique thread initiators the focal user has ever replied to.
\item \textbf{Total degree:} The (undirected) degree centrality captures the total level of the user's involvement in the community in either of the two forms above.
\item \textbf{Clustering coefficient:} A measure of embeddedness or triadic closure, this is the fraction of the focal user's triads that are closed. In general, ideas are more likely to be reinforced and to persist in a triad if it is `tightly-knit'. The positive effect of triadic closure (and balance) on tie quality and persistence has been shown in online social networks such as Twitter \cite{KleinbergBalance}, and we believe the same argument applies to our scenario.
%JULIAN: To be honest I'd do away with most of the above definitions for the social networks track, and just quickly list the features. A quick explanation of why they ought to be useful would be valuable, but definitions for these basic concepts is overkill.
\item \textbf{Weighted closeness centrality:} While degree centrality measures the level of user engagement in the community, it only examines the local structure around the user. In contrast, closeness centrality measures the level of the user's engagement with the global network \textit{either directly or indirectly}. It is relevant in many scenarios, including in online discussions, as information spreads via shortest paths. In our context, a user with high closeness centrality has interacted with a diverse set of users who themselves are close to a large set of diverse users.
For our analysis, we computed the incoming closeness centrality, where only directed paths to the focal user are used. It measures the closeness of the whole network to the coin introducer. Users who start many threads are likely to have higher incoming closeness centrality. The edges in the network are weighted to indicate the intensity of interaction between two users: the weight of an edge is the frequency of interactions between its end points, and the more two users interact, the closer they are deemed to be. Thus, in the computation of weighted closeness centralities, we use the reciprocal of the edge weight as the distance between two users.
\begin{equation}
C_{i} = \frac{N-1}{\sum_{j=1}^{N}\sum_{e \in S_{ij}} \frac{1}{w_{e}}}
\end{equation}
where $S_{ij}$ is the set of edges on the shortest path from $i$ to $j$, $e$ denotes an edge in $S_{ij}$, $N$ is the number of users in the network, and $w_e$ is the weight of edge $e$, determined by the number of interactions between its end points.
Finally, closeness centrality is normalized by the number of users present in the network at the time of coin introduction, so that the comparison between the closeness centrality of various users (who introduce the coins) at different times is valid.
%Our analysis consisted of three versions of the unweighted closeness centrality:
%\begin{enumerate}[topsep=0pt,itemsep=-0.5ex,partopsep=1ex,parsep=1ex]
% \item \textbf{Incoming:} Only the directed paths leading to the focal user are used. It measures closeness of the whole network to the focal user. Users who start many threads are likely to have higher incoming closeness centrality.
% \item \textbf{Outgoing:} Only the directed paths starting from the focal user to all the other users are used. It measures closeness of the focal user to the whole network. Users who reply to many threads are likely to have higher outgoing closeness centrality.
% \item \textbf{Undirected:} The paths both from and to the focal users are used. Users who initiate and reply to many threads are likely to have higher undirected closeness centrality.
%\end{enumerate}
%All measures are normalized by the number of users present in the network at the time of coin introduction, so that the comparison between the closeness centrality of various users (who introduce the coins) at different times is valid.
\item \textbf{Weighted betweenness centrality:} Betweenness is closely related to the theory of weak ties and structural holes, and measures how well the focal user acts as a bridge. In our context, one could interpret betweenness centrality as a generational bridge. Bitcointalk has been an active forum since early 2010, and many users who were active in its early days are no longer present in the forum. There are, however, some early users who are still active. These users have high betweenness centrality as they act as generational bridges between the founders and the newcomers to the community. Another standard interpretation of high betweenness centrality is that the user simultaneously interacts with two otherwise isolated communities in the forum. As with closeness centrality, our betweenness centrality computation uses the inverse of edge weights as the distance between two users.
\item \textbf{Satoshi distance:} The directed distance from Satoshi can be interpreted as a founder effect. Closeness to, or direct interaction with, Satoshi constitutes a form of social capital in the community. The maximum distance from Satoshi in our directed network is 6. In order to include this variable in our regression analysis, we used a distance of 7 for those users who had no path from Satoshi, and added an auxiliary binary variable which indicates whether there is a path from Satoshi to the user.
\item \textbf{Weighted PageRank:} The PageRank score of the user, computed on the weighted network so that higher edge weights, measured by the frequency of interactions, facilitate the flow of the random walk.
\item \textbf{Weighted Satoshi PageRank:} Similar to the regular PageRank above, with the only difference that resets of the random walk always lead to Satoshi instead of being drawn from a uniform distribution over all users. It can be interpreted as the level of influence or credibility allocated from the founder to other users.
\end{enumerate}
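
As an illustration of the weighted closeness computation described above, consider a hypothetical network (not taken from our data) with $N = 3$ users: the focal user $i$ and two other users $u$ and $v$. Suppose $u$ has replied twice to threads started by $i$, giving an edge $u \to i$ with weight $2$, and $v$ has replied four times to threads started by $u$, giving an edge $v \to u$ with weight $4$. Assuming, as in the incoming variant, that distances are measured along directed paths ending at $i$, the distance from $u$ is $1/2$ and the distance from $v$ is $1/4 + 1/2 = 3/4$, so that
\[
C_{i} = \frac{3-1}{\frac{1}{2} + \frac{3}{4}} = \frac{2}{5/4} = 1.6 .
\]
If $v$ had instead replied four times directly to threads started by $i$, its distance would be $1/4$ and we would obtain $C_{i} = 2 / (3/4) \approx 2.67$. More frequent and more direct interaction therefore yields a higher closeness centrality.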