Leveraging Natural Language Processing and Large Language Models for Research Exploration and Data Analysis

Dates: 12th, 13th and 15th July 2024

Course Introduction / Description:

Generative AI is a type of artificial intelligence that can create new content. Large Language Models (LLMs), a part of Generative AI, are deep learning-based transformer architectures (e.g., GPT-4, Generative Pre-trained Transformer 4). They are considered a significant breakthrough in Natural Language Processing (NLP) and AI and have shown substantial potential to transform organizations and society in several ways. An example of an LLM-based application is ChatGPT, which has recently gained widespread attention for its exceptional language generation skills and has demonstrated tremendous capabilities across various domains and tasks, such as question answering and passing examinations (such as the Uniform Bar Exam), thereby even challenging our wisdom and cognition.

The primary purpose of this course is to provide knowledge and a deep understanding of the concepts, techniques, and methods that serve as a foundation for LLMs such as ChatGPT. Starting from basic NLP concepts, the course will delve into deep learning architectures for NLP and Generative AI and then into applications and data analysis using LLMs. Subsequently, the course will focus on the opportunities, challenges, and risks associated with Generative AI models like ChatGPT and their implications for organizations and society.

 

The course is mainly designed for PhD students who want to use NLP and LLM-based text analysis in their research. It also contains hands-on exercises on these topics using the Python programming language. PhD students are expected to have some basic understanding of either the Python or R programming language and some familiarity with running Python scripts using Jupyter Notebooks. The following is the course outline.

  1. First, the course starts with some fundamental concepts of machine learning (ML) and NLP. Additionally, it focuses on using these techniques for data analysis with supervised and unsupervised approaches, such as text classification and topic modeling.
  2. Second, it presents the high-level architectures of deep learning, generative models, and LLMs and elaborates on why LLMs like ChatGPT have achieved so many analytical
  3. Third, it will provide a detailed account of these models' capabilities, possible applications to various fields, and how they will impact society and organizations in the
  4. Fourth, it will present how LLMs can be used for text analysis by examining some of the techniques, such as text summarization, text classification and code generation.
  5. Finally, it will discuss these models' diverse societal impacts and challenges, especially in terms of inequity, misuse, and legal and ethical considerations.

Course Learning Outcomes:

After completing this course, the participants should be able to:

  1. Demonstrate a fundamental understanding of NLP and how it can be used for the analysis of text
  2. Explain the fundamental principles of generative AI and LLMs and how they can be used for data analysis
  3. Compare various approaches to using Generative AI and LLMs, demonstrating their practical relevance through real-world applications and case studies
  4. Describe the key challenges and opportunities, including issues related to reliability, hallucination, and ethical considerations in using Generative AI and LLMs in various

Pre-requisites:

Some basic understanding of either the Python or R programming language and the ability to run Python scripts using Jupyter Notebooks.

Pedagogy:

The course will be conducted through:

  • Presentations
  • Videos

Session Plan:

Session (1.5 hours each) | Topic & Objective | Study Material | Time

Day-01 (6.5 hours), lunch at 12:45-13:30

1 | Fundamentals of ML and NLP – I | Slides, articles and other reading materials | 9:30-11:00
2 | Supervised approaches for NLP: text classification and sentiment analysis – I | Slides, articles and other reading materials | 11:15-12:45
3 | Hands-on 1: text classification and sentiment analysis | Jupyter notebooks and Python scripts | 13:30-15:00
4 | Unsupervised approaches for NLP: topic modeling | Slides, articles and other reading materials | 15:15-16:45
(30 min) | Reflection | | 16:45-17:15

Day-02 (7.5 hours), lunch at 14:00-14:30

5 | Hands-on 2: topic modeling and word vectors | Jupyter notebooks and Python scripts | 9:00-10:30
6 | Deep learning models for NLP: vector representation of words, word vectors/embeddings | Slides, articles and other reading materials | 10:45-12:15
7 | Hands-on 3: word vectors/word embeddings | Jupyter notebooks and Python scripts | 12:30-14:00
8 | Introduction to LLMs: transformer architecture and generating text with transformers | Slides, articles and other reading materials | 14:30-16:00
9 | Introduction to prompting, configuring, and fine-tuning LLMs for specific applications | Slides, articles and other reading materials | 16:15-17:45

Day-03 (6 hours), lunch at 14:00-14:30

10 | Data analysis using LLMs: text summarization and text classification | Slides, articles and other reading materials | 9:00-10:30
11 | Hands-on 4: text summarization and text classification using LLMs | Jupyter notebooks and Python scripts | 10:45-12:15
12 | LLM use cases, challenges, opportunities, and ethical considerations | Slides, articles and other reading materials | 12:30-14:00
13 | Wrap-up: Discussion and the projects! | | 14:30-16:00

 

Evaluation Criteria: 

Sr. No. | Component | Individual / Group | Weightage
1 | Class attendance and active participation | Individual | 10%
2 | Online quizzes (2) | Individual | 40%
3 | Final project | Individual | 50%
Total | | | 100%

 

 Profile of Instructors:

Raghava Mukkamala: 

Raghava Mukkamala is an associate professor at the Department of Digitalization, Copenhagen Business School (CBS), Denmark. Raghava is also the programme director for the Master's Programme in Data Science at CBS and teaches several courses in Deep Learning and Natural Language Processing. His research is primarily centered on Data Science, Blockchain Technologies, and Cybersecurity. His current research focuses on developing novel computational methods to analyze social media discourse, misinformation, and hate speech by combining formal/mathematical modeling techniques with advanced machine learning algorithms. As part of a pro-bono research collaboration with the United Nations High Commissioner for Refugees (UNHCR), he works on domain adaptation and fine-tuning of Large Language Models to identify hate speech and bias against refugees. Although most of his research is published in IEEE and ACM journals, he has also published several papers in FT-50/AJG-4*/ABDC-A* journals such as the Journal of the Association for Information Systems (JAIS) and the Journal of Management Information Systems (JMIS). Raghava holds a Ph.D. in Theoretical Computer Science and an M.Sc. in Information Technology from the IT University of Copenhagen, Denmark.

Link to homepage: https://www.cbs.dk/en/staff/rrmdigi

Shivshanker Singh Patel:

Dr. Shivshanker is Chair/Head of the Inter-disciplinary Decision Science & Analytics Lab (IDeAL) at the Indian Institute of Management (IIM) Visakhapatnam. He is currently a faculty member in the Decision Sciences Department at IIM Visakhapatnam. He has previously worked as a Manager at Mphasis-NextLabs in the data science domain and as an R&D Engineer with Mahindra & Mahindra Automotive. His interests lie in data science, game theory, forecasting, and optimization applied to scarce resource and logistics management. He is an alumnus of the Indian Institute of Science Bangalore, IIT Roorkee, and NIT Raipur, where he earned his Ph.D., Master's, and Bachelor's degrees, respectively.

Faculty Development Programs (FDP)


Title

Dates

Topics Covered (Indicative)

Mode (online/offline)

Capacity Building Program on Communication Skills (Online)

June 20-24, 2022

  • Public Speaking
  • Presentations
  • Professional Writing
  • Interviews

Online

Online Workshop on Communication Skills for Musaliar Institute of Management Students

January 24 to February 12, 2022

  • English Grammar & Usage
  • Common Errors in English
  • Effective Listening
  • Public Speaking
  • Making & Delivering Effective Presentations
  • Professional email Etiquette
  • Writing SOPs
  • Preparing CV/Resume
  • Tips for success in an Interview

Online

Open FDP on Advanced Multivariate Data Analytics: Moderation and Mediation Analysis using AMOS & Process Macro

October 18-22, 2021

  • Introduction to SEM
  • Measurement Model Assessment, Reliability and Validity
  • Path Analysis
  • Mediation
  • Moderation: Interaction Effect and Multi-Group Analysis
  • Confirmatory Factor Analysis
  • Moderated Mediation Analysis and Mediated Moderation Analysis

Online

Capacity-Building Workshop on Communication Skills

October 4 - 8, 2021

  • English Grammar & Usage
  • Common Errors in English
  • Effective Listening
  • Public Speaking
  • Making & Delivering Effective Presentations
  • Professional email Etiquette
  • Writing SOPs
  • Preparing CV/Resume
  • Tips for success in an Interview

Online

AICTE ATAL Online FDP on Data Analytics for Research and Publication

October 4-8, 2021

  • Introduction to Multivariate Data Analytics
  • Mental Well Being
  • Multi-Criteria Decision Making
  • Qualitative Data Analysis
  • Model Building
  • Reliability and Validity of the Measurement Scale
  • Structural Equation Modelling
  • Panel Data Analysis
  • Bibliometric Analysis
  • How to Publish in High Impact Factor Journal

Online

Open FDP on Digital and Social Media Marketing

September 20 - 23, 2021

  • Introduction to Digital Marketing
  • Search Engine Optimization: Keyword Optimization, Website Optimization
  • Search Engine Marketing, Affiliate Marketing, Content Marketing
  • Social Media Marketing Strategies, Social Media Marketing Campaign
  • Facebook Marketing, LinkedIn Marketing, Twitter Marketing, Instagram Marketing, YouTube Marketing
  • Influencer Marketing, Customer Relationship Management using Digital Marketing
  • Measuring Effectiveness of Digital Marketing Campaign, Digital and Social Media Analytics

Online

Open FDP on Analytics

August 16-17, 2021

  • Overview of Analytics in Organizations
  • Introduction to R
  • Data Visualization
  • Descriptive Analytics and Inference
  • Predictive Analytics: Simple and Multiple Linear Regression
  • Predictive Analytics: Classification, Forecasting

Online

Open FDP on Pedagogy and Research Methodology

July 5-9, 2021 & July 12-16, 2021

  • Communication
  • ICT Enabled Teaching
  • Case Writing
  • Case Based Teaching
  • Questionnaire Design
  • Sampling Methods
  • Statistical Inference using R

Online

AICTE ATAL Online FDP on Data Analytics for Research and Publication

June 14-18, 2021

  • Introduction to Multivariate Data Analytics
  • Mental Well Being
  • Multi-criteria Decision Making
  • Qualitative Data Analysis
  • Model Building
  • Reliability and Validity of the Measurement Scale
  • Structural Equation Modelling
  • Panel Data Analysis
  • Bibliometric Analysis
  • How to Publish in High Impact Factor Journal

Online

Open FDP on Handling Partial Least Squares - Structural Equation Modelling (PLS-SEM)

April 19-22, 2021


  • Introduction to PLS-SEM
  • Mediation Analysis
  • Sequential Mediation Analysis and Parallel Mediation Analysis
  • Moderation Analysis
  • Interaction Effect and Multi-Group Analysis


Online

National Institute of Business Management (NIBM), Sri Lanka

January 3-4, 2020

  • Teaching and learning in higher education
  • Teaching with ICT
  • Case-based teaching
  • Experiential exercises in classrooms

Offline (Colombo, Sri Lanka)

Central Board of Secondary Education (CBSE): Leadership Development for School Principals

January 27 - 31, 2020

  • Leadership
  • Change Management
  • Ethics
  • Building cultures of well-being

Offline

National Project Implementation Unit (TEQIP)

FDP Name | Period
TEQIP | 15-19 Jul'19
TEQIP | 26-30 Aug'19
TEQIP | 09-13 Dec'19
TEQIP | 02-06 Mar'20
TEQIP | 26-28 Nov'20
TEQIP | 7-9 Dec'20
TEQIP | 17-19 Dec'20
TEQIP | 11-13 Jan'21
TEQIP | 27-29 Jan'21
TEQIP | 9-11 Feb'21
TEQIP | 17-19 Feb'21
TEQIP | 15-17 Mar'21


  • Stakeholder Management
  • Entrepreneurship
  • Development in Academic Institutions
  • Accreditation & Ranking of Higher Educational Institutions
  • Time Management & Prioritization of Tasks
  • ICT-enabled Pedagogy
  • Costing, Pricing & Positioning of Adhoc Academic Programs

Online & Offline