Only R Studio Experts!
1. Create an author-to-author tweet edge file from the original data set, stocktwit_graph_input.csv.
We need just two columns to create a graph: the source (Vertex 1) and the target (Vertex 2) of each edge. Select all rows (tweets) for columns K, “from_person”, and M, “to_person” (or J and L for the author IDs), and save the result as “stocktwit_from_to” or another name you prefer.
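The column selection in step 1 can also be sketched in R instead of by hand in a spreadsheet. This is a minimal illustration, not the required method: the toy data frame stands in for stocktwit_graph_input.csv, and the header names Source and Target are an assumption based on the convention Gephi uses when importing an edge list.

```r
# Toy stand-in for the real file; in practice this would be something like
# tweets <- read.csv("stocktwit_graph_input.csv")
tweets <- data.frame(
  from_person = c("alice", "bob", "alice", "carol"),
  to_person   = c("bob", "carol", "carol", "alice"),
  suggested   = c(1, 0, 1, 0),
  stringsAsFactors = FALSE
)

# Keep only the two columns that define an edge (one row per tweet).
edges <- tweets[, c("from_person", "to_person")]

# Assumed header names for a Gephi edge-list import.
names(edges) <- c("Source", "Target")

# Written to a temporary folder here so the sketch has no side effects;
# use your own path and file name for the actual assignment.
out_path <- file.path(tempdir(), "stocktwit_from_to.csv")
write.csv(edges, out_path, row.names = FALSE)
```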
2. Use Gephi to generate and save author (node) metrics. Select the metrics you would like to analyze and use for building models later. Include at least 5 different metrics. Save the metrics in a file named stocktwit_node_yourname.csv. Submit this file. Include answers to the following questions in HW6_yourname.doc for submission.
a. Which three authors have the highest betweenness centrality?
b. Which three authors have the highest total degree?
c. Which three authors have the highest closeness?
3. Build the Node Table for Prediction
(1). Open the stocktwit_node.csv file in Excel, and create a new variable: Expert (i.e. suggested). It is the target variable we aim to classify or predict.
(2). Do not close the stocktwit_node.csv file. Open the stocktwit_graph_input.csv file, and then go back to stocktwit_node.csv.
(3). Note that the unit in the stocktwit_node.csv file is a node (i.e. each individual author) and the unit in the stocktwit_graph_input.csv file is a tweet (i.e. each message). So, in order to transfer the value of Expert (“suggested”) from the stocktwit_graph_input table to the stocktwit_node table, we need to do a data transformation.
For Expert, we need to assign one value to each author (i.e. whether they are an expert or not: 1 stands for yes; 0 stands for no).
Use the VLOOKUP function to assign the value of “suggested” from the stocktwit_graph_input table to the “Expert” column in the stocktwit_node table. The function for the first row should look like this:
= VLOOKUP(A2, stocktwit_graph_input.csv!$K$1:$AB$38200,18,FALSE),
where “A2” is the node name; “stocktwit_graph_input.csv!$K$1:$AB$38200” is the table range we look up; 18 is the number of the column within that range whose value we want returned (column AB is the 18th column counting from K); and “FALSE” requests an exact match of the value.
(4). Save the stocktwit_node.csv file. You can delete the rows that have a missing value in Expert: these nodes appear only in the “to_person” column, so they have no tweets.
Use the filter function in Excel to remove the #N/As.
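For readers who prefer to check the Excel result in R, the same lookup can be sketched with match(), which returns the first matching row much like VLOOKUP with FALSE. The data below is a made-up stand-in for the two files, and the node column name Id is an assumption about how the Gephi export is labelled.

```r
# Toy tweet table: one row per tweet, "suggested" marks expert authors.
tweets <- data.frame(
  from_person = c("alice", "bob", "alice"),
  to_person   = c("bob", "carol", "carol"),
  suggested   = c(1, 0, 1),
  stringsAsFactors = FALSE
)

# Toy node table: one row per author (the Id column name is assumed).
nodes <- data.frame(
  Id     = c("alice", "bob", "carol"),
  Degree = c(3, 2, 2),
  stringsAsFactors = FALSE
)

# For each node, find the first tweet written by that author.
# Authors with no tweets get NA, the counterpart of #N/A in Excel.
first_tweet <- match(nodes$Id, tweets$from_person)
nodes$Expert <- tweets$suggested[first_tweet]

# Drop nodes that only ever appear in to_person.
nodes <- nodes[!is.na(nodes$Expert), ]
```

In this toy data carol never appears in from_person, so her row is dropped, mirroring the filtering described in step (4).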
4. In R, build and evaluate a classification model that uses the metrics in stocktwit_node_yourname.csv from step 2 as features to classify authors as “expert” stocktwit authors (“suggested”=1) or not (“suggested”=0), which is the target label variable.
(1). Using a seed of 100, randomly select 60% of the rows for training (e.g. called traindata). Divide the other 40% of the rows evenly into two test/validation sets (e.g., called testdata1 and testdata2).
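One way to do the 60/20/20 split in base R is sketched below. The nodes data frame is toy data with hypothetical column names; only the seed of 100 and the names traindata, testdata1 and testdata2 come from the assignment text.

```r
# Toy node table with 50 authors (column names are placeholders).
nodes <- data.frame(
  Id     = paste0("author", 1:50),
  Degree = rep(1:5, times = 10),
  Expert = rep(c(0, 1), times = 25)
)

set.seed(100)                       # seed required by the assignment
n        <- nrow(nodes)
shuffled <- sample(n)               # random permutation of the row numbers

n_train <- round(0.6 * n)           # 60% for training
n_test1 <- floor((n - n_train) / 2) # half of the remainder

train_idx <- shuffled[seq_len(n_train)]
test1_idx <- shuffled[n_train + seq_len(n_test1)]
test2_idx <- shuffled[(n_train + n_test1 + 1):n]

traindata <- nodes[train_idx, ]
testdata1 <- nodes[test1_idx, ]
testdata2 <- nodes[test2_idx, ]
```

Shuffling once and slicing the permutation guarantees the three sets do not overlap.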
(2). Build the tree using the C50 function with default settings.
(3). Make predictions (i.e. estimations) of the values of the target variable for the testing instances.
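A minimal sketch of steps (2) and (3), assuming the C50 package is installed and that the target has been converted to a factor, which C5.0 requires. The feature names and values are invented for illustration; the guard simply skips the sketch when the package is not available.

```r
if (requireNamespace("C50", quietly = TRUE)) {
  # Toy training data: two hypothetical node metrics and the Expert label.
  traindata <- data.frame(
    Degree      = c(1, 2, 1, 3, 2, 9, 8, 10, 7, 9, 1, 2, 8, 9, 3, 10),
    Betweenness = c(0, 1, 0, 2, 1, 30, 25, 40, 20, 35, 0, 1, 28, 33, 2, 45),
    Expert      = factor(c(0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0, 1),
                         levels = c(0, 1))
  )
  testdata1 <- data.frame(
    Degree      = c(2, 9, 1, 8),
    Betweenness = c(1, 31, 0, 27)
  )

  features <- c("Degree", "Betweenness")

  # Decision tree with default settings (no boosting, no rule sets).
  model <- C50::C5.0(x = traindata[, features], y = traindata$Expert)

  # Predicted class for each testing instance.
  pred1 <- predict(model, newdata = testdata1[, features], type = "class")
}
```

The same predict() call would be repeated for testdata2.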
Generate a confusion matrix that shows the counts of true-positive, true-negative, false-positive and false-negative predictions for both testdata1 and testdata2. Consider 1 as the positive class.
Generate seven performance metrics: accuracy (percent of all correctly classified testing instances), plus precision (percent of instances predicted to belong to a class that actually do), recall (also called the true positive rate) and F-measure (also F-score) for each of the two classes of Expert.
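The confusion matrix and the seven metrics can be computed in base R along the following lines. The actual and predicted vectors are made-up values, and class_metrics is a hypothetical helper written for this sketch, not a function from any package.

```r
# Made-up labels for ten testing instances.
actual    <- factor(c(1, 1, 1, 0, 0, 0, 0, 1, 0, 0), levels = c(0, 1))
predicted <- factor(c(1, 0, 1, 0, 0, 1, 0, 1, 0, 0), levels = c(0, 1))

# Rows are predictions, columns are true labels.
cm <- table(Predicted = predicted, Actual = actual)

# Precision, recall and F-measure for one class, treated as "positive".
class_metrics <- function(cm, cls) {
  tp        <- cm[cls, cls]
  precision <- tp / sum(cm[cls, ])   # correct among those predicted as cls
  recall    <- tp / sum(cm[, cls])   # correct among those truly cls
  f_measure <- 2 * precision * recall / (precision + recall)
  c(precision = precision, recall = recall, f_measure = f_measure)
}

accuracy  <- sum(diag(cm)) / sum(cm)
metrics_1 <- class_metrics(cm, "1")  # expert class
metrics_0 <- class_metrics(cm, "0")  # non-expert class
```

Accuracy plus three metrics for each of the two classes gives the seven values the assignment asks for; run the same code once per test set.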
(4). Would you recommend using the features from network analysis to identify experts in the Stocktwit community? Why or why not? Include your answers in HW6_yourname.doc for submission.