Instructor

Carolina Salge: MWF 11:15 - 12:05pm (Sanford Hall 212)

Final Exam: Fri., Dec. 9, 12:00 - 3:00pm

Course Description

This course provides the skills necessary to conceptualize, build, and implement systems utilizing business intelligence in organizations. Topics include big data, executive information systems, dashboards and scorecards, machine learning, text mining, and MapReduce. The course is divided into two sections: (1) descriptive and (2) predictive analytics.

The course syllabus is a general plan for the course; deviations announced to the class by me (the instructor) may be necessary.

Prerequisites, Corequisites

MIST 4610

Objectives

Students completing this course will

  1. become familiar with BI concepts and frameworks
  2. be able to select and use a variety of BI software products
  3. understand big data and analytics
  4. be aware of business uses and values of BI
  5. be aware of future trends and directions for BI
  6. be familiar with career opportunities in BI and analytics

Topics

  1. Big data
  2. Executive information systems
  3. Dashboards and scorecards
  4. Data visualization
  5. Machine learning
  6. Text mining
  7. MapReduce

Text

There is no textbook for this course. Links to required readings are on the schedule. Videos are available on the Teradata University Network. The password is analytics (not case sensitive).

Software

We will use MicroStrategy (manual) for descriptive analytics. We will then use Tableau (training), SAS Visual Analytics (documentation), IBM Watson Analytics (getting started), and BigML (video) for predictive analytics.

R & DataCamp

R is an open-source software environment for statistical computing and graphics, and RStudio is the interface to R. Download the latest versions of both for your operating system, since you will be required to complete a number of online lessons to learn R at DataCamp. While the introductory lesson is free, you will be required to complete some courses that are not. You should expect to pay between $18 and $27 to complete these courses. This assignment is not a group project; all your work should be conducted individually and without consultation with any other students in the class.

Each one of you should:

  1. Sign up for an account on DataCamp using your full name and your UGA email address.
  2. Join the MIST 5620 - Fall 2016 team using this invite link.
  3. The Intro course is free but you will need to purchase a subscription to complete the remaining courses.
  4. Extra credit opportunity: Complete the Data Manipulation with dplyr, the Big Data Analysis with Revolution R Enterprise, and the Introduction to Machine Learning modules for 1% extra credit each (total of 3%), added to your final grade.

Reminder: If you do not plan to continue studying R with DataCamp when you conclude the course requirements, please remember to cancel your subscription.

Identifier | Course | Description | Duration
DC1 | Introduction to R | Write your first R code and discover vectors, matrices, data frames and lists | Approximately 4 hours
DC2 | Intermediate R | Learn about conditional statements, loops and functions to power your R scripts | Approximately 6 hours
DC3 | Data Visualization in R with ggvis | Use the grammar of graphics to create (interactive) graphics of your data | Approximately 4 hours
DC4 | Reporting with R Markdown | Create dynamic HTML documents, presentations, and PDF reports in R | Approximately 3 hours

Group Size

Groups should contain three to four people.

Academic Honesty

As a University of Georgia student, you have agreed to abide by the University's academic honesty policy, "A Culture of Honesty," and the Student Honor Code. All academic work must meet the standards described in "A Culture of Honesty." Lack of knowledge of the academic honesty policy is not a reasonable explanation for a violation. Questions related to course assignments and the academic honesty policy should be directed to me.

Team Work

In this class, you will work in teams. To prepare, review a short report on team effectiveness and establish a team agreement (sample agreement). Give me a copy of your team agreement by Aug 26.

Freeloader Policy

It occasionally happens in class and enterprise settings that someone in a group is not prepared to do his/her share. In the case of my classes, I recommend that the team give the freeloader one warning and then fire that person from the team. That person will then do group assignments individually or find another team to join. The team should notify me of the change in team composition immediately. I distribute a form to assess team participation at the end of the semester. If a major disparity in team contribution is reported, I will adjust team project grades.

Laptop Policy

Students are welcome to use laptops in class exclusively for note taking and completing class exercises. If you plan to take notes on a laptop, please let me know and email me a copy of the notes at the end of each class.

Attendance

Attendance and participation are required for this course. Excessive unexcused absences (i.e., greater than 4) will result in a Drop or Withdrawal for Non-Attendance according to UGA policy.

Assignments

See the class schedule for due dates. Each assignment is due at 11:59pm on its due date.

Exercises

Identifier Topic Exercise
A1
(Rubric)
Scorecard
(Sample)
Create a balanced scorecard for yourself. In particular, identify three to six major perspectives in your personal and professional life that need to be carefully monitored. Then pick one of the major perspectives (e.g., doing well physically) for more careful analysis and identify component parts (e.g., getting healthy) and specific metrics (e.g., dropping weight) that are relevant. It should be clear how all of the metrics are measured and calculated.
A2
(Rubric)
MicroStrategy Web MMT
(Sample)
Access the web version of MicroStrategy (through the Teradata University Network) to learn how to use the software’s reporting and analysis capabilities. Begin by registering (use your full name as your ID). Complete all of the training modules and pass the test for each module with a score of 100%. Continue to take the tests until you score 100%.
A3
(Rubric)
MicroStrategy Mobile BI
(Sample)
It has been suggested that the movement to mobile BI is as significant as the evolution of client/server computing to web-based applications. It places BI in the hands of users wherever they are through a variety of smart devices. The major BI vendors recognize this movement and have added mobility to their products; that is, mobile devices connect users to the vendors’ BI platforms to access reports, graphs, dashboards/scorecards, and specialized applications. MicroStrategy is a leader in mobile BI. The first step in this assignment is to watch a video about MicroStrategy mobile BI here. While watching this video, answer the following questions:
  • How can a pharmaceutical sales rep use mobile BI?
  • Who is the key user of mobile BI at the Container Store?
  • How can the wait time for an app to open be customized?
The next part of this assignment is to experience mobile BI on your smart phone or tablet computer. First you will need to download the MicroStrategy mobile BI app. Next, open the app and find the employee benefits tab. Assume you are Thomas Smith and answer the following questions:
  • In what city do you live?
  • How many vacation/sick days do you currently have?
  • What’s your gross pay per month?
A4
(Rubric)
Tableau
(Sample)
Assume you are a marketing manager in Superstore and you have a sense that there are profitability issues in your products. You don’t know exactly how to define the problem or what factors contribute to the issues. But you want to explore this situation by visualizing the data you’ve received from those kind folks in IT. Import the Coffee Chain dataset and begin to explore by asking:
  • What products are underperforming? What correlates with profit?
  • Are there issues with certain product lines, products, markets, pricing structures (margins), and costs?
There are multiple answers. You’ve been given data and you need to find where the problems are. In your report, explain what you discovered and include screenshots from your analysis to show the logic of how you reached your conclusions. Convince me that you found the problem. With each screenshot, state what question you are asking, what you observe, and why you went where you did next in your step-by-step problem exploration process. Treat this as a problem-solving exercise, treasure hunt, or business case. You may want to watch this Tableau Getting Started Video as you complete the assignment.
A5
(Rubric)
SAS Visual Analytics
(Sample)
Complete the first three SAS Visual Analytics assignments under Course Contents on the SAS VA homepage.
A6
(Rubric)
Text Mining
(Sample)
Create a Twitter account with a valid phone number. Go to Twitter App, log in, and create an application. Use the information from your account in RStudio to set up access to Twitter's API. Choose a topic of your interest and search for the most recent 10,000 tweets in English. Extract the text from your tweets and clean it by removing punctuation, numbers, stop words, and white space. In addition, transform the text to lower case and remove the keyword(s) included in your search. Finally, create a word cloud to get a crude idea of what is recently being said about your chosen topic on Twitter.

Note. You will need to install and use four different packages for this assignment: twitteR, RCurl, tm, and wordcloud.
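
The cleaning steps listed in A6 can be sketched outside of R to make the order of operations concrete. The block below is only an illustration in Python, not the R code you will submit: the stop-word list is a deliberately tiny, hypothetical one, the two sample tweets are invented, and the helper names clean_tweet and word_frequencies are made up for this sketch. A word cloud simply sizes each word by the frequency counted at the end.

```python
import re
import string
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = {"a", "an", "and", "the", "is", "are", "to", "of",
              "in", "for", "on", "at", "it", "this"}

def clean_tweet(text, keywords):
    """Lower-case the text, strip numbers and punctuation, then drop
    stop words and the search keyword(s). Returns a list of words."""
    text = text.lower()
    text = re.sub(r"\d+", " ", text)  # remove numbers
    # Replace every punctuation character with a space.
    table = str.maketrans(string.punctuation, " " * len(string.punctuation))
    text = text.translate(table)
    drop = STOP_WORDS | {k.lower() for k in keywords}
    # split() with no arguments also collapses repeated white space.
    return [word for word in text.split() if word not in drop]

def word_frequencies(tweets, keywords):
    """Count how often each cleaned word appears across all tweets."""
    counts = Counter()
    for tweet in tweets:
        counts.update(clean_tweet(tweet, keywords))
    return counts

# Invented sample tweets, for illustration only.
sample = [
    "Tableau 10 is GREAT for dashboards!",
    "Dashboards in Tableau: great, great charts.",
]
freqs = word_frequencies(sample, keywords=["tableau"])
print(freqs.most_common(3))
```

Removing the search keyword matters because it appears in every tweet by construction and would otherwise dominate the cloud.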
A7 (Rubric) Dashboards
(Link to Dashboard)
(Sample)
Create a shiny dashboard with information about a topic of your interest. However, make sure that you:
  • Do not use Twitter data; we already covered that twice (class and A6). You may, however, fetch any other type of Internet data (e.g., Facebook, LinkedIn, Yelp, etc.). In fact, you may want to check Kaggle, which has a lot of interesting data you can download and use.
When creating your dashboard, make sure to:
  • Have a title for your header, at least 2 sidebar menu items containing information, and also a minimum of 2 boxes/infoBoxes/tables in the body that also provide the viewer with some sort of information;
  • Learn how to share your dashboard for free using Shinyapps.io here. You will submit a PDF file with your compiled code (follow the format in A6) alongside a link to your dashboard.
This is your dashboard to create. Make the most out of it! You may want to check Shiny's Gallery for some example ideas and additional guidance.
A8 (Rubric) MapReduce
(Sample)
Using Delta’s performance data for February 2013, do the following:
  1. With regular R commands, compute the minimum, average, and maximum departure delay in minutes (DepDelayMinutes) for each origin airport. Use head() to show the first six rows.
  2. Use MapReduce to undertake the same computations. Test your code in local mode. Use head() to show the first six rows.
  3. What is the average departure delay in minutes for Atlanta?
Note. You will need to install and use ten different packages for this assignment. Follow the instructions here.
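
The MapReduce pattern behind step 2 of A8 can be sketched in a few lines. The block below is a rough Python illustration of the idea only; it is not the R code or the packages used in the assignment. The flight records are invented, the column name Origin is an assumption (DepDelayMinutes comes from the assignment text), and the function names are made up for this sketch.

```python
from collections import defaultdict

def map_phase(records):
    """Map: emit one (key, value) pair per flight, keyed by origin airport."""
    for record in records:
        yield record["Origin"], record["DepDelayMinutes"]

def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: collapse each group to (minimum, average, maximum)."""
    return {
        key: (min(values), sum(values) / len(values), max(values))
        for key, values in groups.items()
    }

# Invented records, for illustration only; not Delta's actual data.
flights = [
    {"Origin": "ATL", "DepDelayMinutes": 0.0},
    {"Origin": "ATL", "DepDelayMinutes": 30.0},
    {"Origin": "JFK", "DepDelayMinutes": 10.0},
    {"Origin": "ATL", "DepDelayMinutes": 15.0},
    {"Origin": "JFK", "DepDelayMinutes": 20.0},
]

stats = reduce_phase(shuffle(map_phase(flights)))
for origin, (lowest, average, highest) in sorted(stats.items()):
    print(origin, lowest, average, highest)
```

Running everything in one process like this corresponds loosely to testing in local mode: the logic is the same, but no cluster is involved.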

State-of-the-art presentations (group) will begin Aug 31, with one presentation per class.

Each group is required to present a business intelligence software product, with a particular concentration on open-source products.

Some suggested products follow, and you can propose others by contacting me. Submit your bid for a product via e-mail, and specify your team's name when you do. Those who bid early present early.

Software Presentation Date
Apache Zeppelin (pdf) Sept 26
TIBCO Spotfire (ppt) Sept 16
SAP's Lumira (ppt) Sept 9
Planners Lab  
DQ Analyzer (pdf) Sept 28
Amazon QuickSight (ppt) Sept 12
Amazon ML (ppt) Sept 7
QlikView (ppt) Sept 23
Information Builders WebFOCUS  
Logi Vision (ppt) Sept 30
Microsoft PowerPivot (ppt) Sept 21
Google BigQuery (ppt) Sept 2
Google Prediction (ppt) Aug 31
PredicSis (ppt) Sept 19

IBM Watson Analytics Project (Group) - 20 points

Select a project of your own choosing. This requires developing an appropriate application (i.e., being able to explain or predict a phenomenon), preparing and analyzing the data, creating a model (if a predictive application), and telling the story of your analysis in a presentation to the class. The more challenging the project, the better the opportunity for a top grade. Although data sets are available, more challenging projects identify and collect their own data, such as from the Internet.

Follow the guidelines for the IBM Watson Analytics project.

You should submit your bid for a presentation date via e-mail. When submitting a bid specify your team's name and chosen application topic.

Application Topic Presentation Date Order
Yelp Nov 28 1
TBA Nov 28 2
Uber Nov 28 3
TBA Nov 28 4
Game of Thrones Nov 30 1
Movies(IMDB) Nov 30 2
March Madness Nov 30 3
UGA Campus Transit Nov 30 4
TBA Dec 2 1
Sports Dec 2 2
Chipotle Dec 2 3
UGA Parking Services Dec 2 4

BigML Project (Group) - 10 points

Part A. In the 2012 Presidential Election, Obama and the Democrats received considerable recognition for the use of analytics to understand the electorate and get out the vote. Use the 2012 POTUS Winner by County dataset in BigML to answer the following questions:

Part B. Churn is a major problem for telecommunications firms. It is not unusual for 20 percent of a company’s customers to not renew their contracts. Because of this, telcos are using analytics to identify the customers who are most likely to churn so they can intervene and try to persuade these customers to stay, for example by providing attractive renewal offers or promising better service through a new cell tower. Use the Telco churn dataset to develop a model to predict churn. Exclude any variables that are unlikely to be related to churn and provide the logic behind your thinking. Also check for any variables that need to be recategorized from numerical to categorical and discuss why.

Part C. Using one of the other datasets provided on the BigML site, create a model, and then develop questions that you answer using the Prediction feature. In designing your model, first build it using part of the data (say 80%) and then test it using the remaining data.
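
The 80/20 build-and-test idea in Part C can be sketched as follows. This is a minimal Python illustration of the holdout technique, not how BigML implements it; the function names, the fixed seed, and the toy dataset are all assumptions made for this sketch.

```python
import random

def train_test_split(rows, train_fraction=0.8, seed=42):
    """Shuffle a copy of the rows and cut it into a training part
    and a held-out test part. The seed keeps the split repeatable."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def accuracy(predict, test_rows):
    """Share of held-out rows whose label the model predicts correctly."""
    hits = sum(1 for features, label in test_rows if predict(features) == label)
    return hits / len(test_rows)

# Toy dataset, invented for illustration: (feature, label) pairs.
rows = [(i, i % 2 == 0) for i in range(100)]
train_rows, test_rows = train_test_split(rows)

# Stand-in "model" for the sketch; a real model would be fitted on train_rows only.
def predict(feature):
    return feature % 2 == 0

print(len(train_rows), len(test_rows), accuracy(predict, test_rows))
```

The point of holding out the last 20% is that the model never sees those rows while it is being built, so the accuracy measured on them is a fairer estimate of how it will perform on new data.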

Follow the guidelines for the BigML project.

Professional Development

There are two components: (1) Completion of the Arch Ready Professionalism Certificate, which requires attending five events offered by the UGA Career Center or the Terry College of Business, and (2) attending two SMIS meetings.

If you have potential conflicts with meeting the professional development requirements or if you think that there are better development activities for your situation, meet with me to discuss the possibilities. This meeting must take place at the start, not the end, of the semester, and it is your responsibility to schedule it.

Pop Quizzes

There will be ten pop quizzes to test your preparation and understanding of the reading assignments. Quizzes will assess critical points in the readings and will never cover unimportant details. It is critical that you read each and every article prior to class time.

Grading

Item Points
Professional Development 3
State of the Art Presentation 7
Pop Quizzes 10
Exercises 10
DataCamp R 10
Midterm Exam 15
Final Exam 15
Software Projects 30
Total 100
If you are unable to complete an exercise on time or take an exam at the specified time, please advise me as soon as possible so that alternative arrangements can be made.

Schedule - Fall 2016

Class Day Date Readings / Videos Assignment Resubmission
1 Friday 8/12 AMCIS Conference    
2 Monday 8/15 Syllabus    
3 Wednesday 8/17 Career in BI (ppt)
    Watson, H. 2015. Should You Pursue a Career in BI/Analytics? Business Intelligence Journal.
   
4 Friday 8/19 Case Study Session: The BI/Analytics Game (ppt)
    Singer, N. 2014. When a health plan knows how you shop. The New York Times.

    The Economist. 2010. A different game.
   
5 Monday 8/22 History of BI - Part One (ppt)
    Watson, H. 2009. Business Intelligence: Past, Current, and Future. AMCIS Tutorial.
    *Read only pages 488-498
   
6 Wednesday 8/24 History of BI - Part Two (ppt)
    Watson, H. 2009. Business Intelligence: Past, Current, and Future. AMCIS Tutorial.
    *Read only pages 499-506
   
7 Friday 8/26 Executive Information Systems (ppt)
    Watson, H. 2011. What Happened to Executive Information Systems? Business Intelligence Journal.
   
8 Monday 8/29 Dashboards and Scorecards (ppt)
    Brath, R. & Peters, M. 2004. Dashboard Design: Why Design is Important. DM Review.
   
9 Wednesday 8/31 The Balanced Scorecard (ppt)
    Kaplan, R. S. & Norton, D. P. 1992. The Balanced Scorecard: Measures that Drive Performance. Harvard Business Review.
   
10 Friday 9/2 Case Study Session: Search Term Analysis at Bloomingdale's - Guest Speaker: Rick Watson    
11 Monday 9/5 Labor Day Holiday A1  
12 Wednesday 9/7 Introduction to MicroStrategy (ppt)
    Sallam, R. L., Tapadinhas, J., Parenteau, J., Yuen, D., & Hostmann, B. 2014. Magic Quadrant for Business Intelligence and Analytics Platforms. Gartner Magic Quadrant Report.
    *Read only pages 1-4 and also page 16
   
13 Friday 9/9 Case Study Session: Sports Analytics (ppt) - Guest Speaker: Karim Jetha
    Burke, B. 2014. What is Football Analytics? Advanced Football Analytics blog.

    Ross, T. F. 2015. Welcome to Smarter Basketball. The Atlantic.

    MIT Technology Review. 2015. How Network Theory is Revealing Previously Unknown Patterns in Sports. Emerging Technology from the arXiv.

    Blum, R. 2015. The Big Shift: Infields Spin In Response to Data Explosion. AP/New York Times.
   
14 Monday 9/12 Mobile BI - Part One (ppt)
    Watson, H. 2014. Tutorial: Mobile Business Intelligence. Communications of the Association for Information Systems.
    *Read only pages 1-10
A2  
15 Wednesday 9/14 Mobile BI - Part Two (ppt)
    Watson, H. 2014. Tutorial: Mobile Business Intelligence. Communications of the Association for Information Systems.
    *Read only pages 11-23
   
16 Friday 9/16 Case Study Session: Mobile BI (ppt)
    Watson, H., Wixom, B., & Yen, B. 2013. Delivering Value Through Mobile Business Intelligence. Business Intelligence Journal.

    Briggs, L. L. 2011. Apparel Company App Melds Fashion, Mobile BI. Business Intelligence Journal.

    Watson, H. & Leonard, T. 2011. US Xpress: Where Trucks and BI Hit the Road. Business Intelligence Journal.
   
17 Monday 9/19 Introduction to Tableau - Part One (zip file) (ppt)
    Sallam, R. L., Tapadinhas, J., Parenteau, J., Yuen, D., & Hostmann, B. 2014. Magic Quadrant for Business Intelligence and Analytics Platforms. Gartner Magic Quadrant Report.
    *Read only page 25

    Hardin, M., Horn, D., Perez, R., & Williams, L. 2013. Which Chart or Graph is Right For You? Tableau Whitepaper.
A3 A2
18 Wednesday 9/21 Introduction to Tableau - Part Two (zip file) (ppt)    
19 Friday 9/23 Work on Tableau Assignment    
20 Monday 9/26 Selecting the Best BI Tool (ppt)
    Howson, C. 2007. Selecting the Best BI Tool. BIScorecard.
A4 A3
21 Wednesday 9/28 BI Project Approval (ppt)
    Watson, H. 2015. ROI: Getting Projects Approved. Business Intelligence Journal.
   
22 Friday 9/30 Midterm Exam Review: What Have We Learned Thus Far? (ppt)    
23 Monday 10/3 Midterm (Multiple Choice, True/False, & Fill in the Blanks) (sample)    
24 Wednesday 10/5 Midterm (Short Answer Essay) (sample)   A4
25 Friday 10/7 Midterm (Software Problem Application) (data)    
26 Monday 10/10 R and Tableau - Guest Speaker: Ben Daniel at Home Depot (Skype Lecture) (Tableau/R Instructions) (zip file) (R example script)    
27 Wednesday 10/12 Midterm Results & Introduction to SAS Visual Analytics
    Sallam, R. L., Tapadinhas, J., Parenteau, J., Yuen, D., & Hostmann, B. 2014. Magic Quadrant for Business Intelligence and Analytics Platforms. Gartner Magic Quadrant Report.
    *Read only page 24
   
28 Friday 10/14 Data Warehouse - Guest Speaker: Yissel Cervantes at CTS (ppt)    
29 Monday 10/17 Introduction to IBM Watson Analytics (ppt)
    “Module 1: What Is Watson?” Available here
   
30 Wednesday 10/19 Big Data (ppt)
    McAfee, A. & Brynjolfsson, E. 2012. Big Data: The Management Revolution. Harvard Business Review.

    K.N.C. 2014. The Backlash Against Big Data. The Economist.

    Kolhatkar, S. 2016. Bluegrass and Big Data. The New Yorker.

    *TED Talk. Kenneth Cukier: Big Data Is Better Data.
A5  
31 Friday 10/21 Work on Watson Analytics Project    
32 Monday 10/24 Text Mining with R - Sentiment Analysis (ppt) (score.sentiment function)    
33 Wednesday 10/26 Text Mining with R - Word Cloud (ppt) (data)   A5
34 Friday 10/28 Fall Break    
35 Monday 10/31 Dashboards with R - The Basics (ppt)    
36 Wednesday 11/2 Dashboards with R - More Basics (ppt) A6  
37 Friday 11/4 Work on Watson Analytics Project - BigML (ppt)    
38 Monday 11/7 MapReduce - An Introduction (ppt)    
39 Wednesday 11/9 MapReduce with R - Installation
    Dean, J. & Ghemawat, S. 2008. MapReduce: Simplified Data Processing on Large Clusters. Communications of the ACM.
   
40 Friday 11/11 MapReduce with R - Application (ppt)    
41 Monday 11/14 Case Study Session: Social Bots (ppt)
    Ferrara, E., Varol, O., Davis, C., Menczer, F., & Flammini, A. 2016. The Rise of Social Bots. Communications of the ACM.

    MIT Technology Review. 2015. Fake Persuaders.

    Urbina, I. 2013. I Flirt and Tweet. Follow Me at #Socialbot. The New York Times.
   
42 Wednesday 11/16 Using R for Analytics at First Data - Guest Speaker: Michael Anton A7 A6
43 Friday 11/18 Analytics Experiences with Fiserv - Guest Speaker: Bob Trotter    
44 Monday 11/21 Thanksgiving Holiday    
45 Wednesday 11/23 Thanksgiving Holiday A8 A7
46 Friday 11/25 Thanksgiving Holiday    
47 Monday 11/28 IBM Watson Analytics Project Presentations    
48 Wednesday 11/30 IBM Watson Analytics Project Presentations   A8
49 Friday 12/2 IBM Watson Analytics Project Presentations BigML Project  
50 Monday 12/5 Final Exam Review (data1) (data2) (data3) IBM Watson Analytics Project, DC1, DC2, DC3, DC4 & Extra Credit DataCamp Modules A1

Team evaluation

The form should be submitted by Dec 5.