Gridscript

🐍 Python for Data Science

📘 Why Python for Data Science

Python is one of the most popular programming languages for Data Science due to its:

Python allows you to go from data collection to model building to visualization, all within one language.

🧩 Python Syntax & Data Types

Python uses a clean, easy-to-read syntax.
Here are some basic concepts and data types used in almost every program.

Basic Syntax

# This is a comment
print("Hello, Data Science!")

Variables

Variables are used to store data.

name = "Alice"
age = 25
height = 1.68

Data Types

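| Type  | Example                      | Description            |
|-------|------------------------------|------------------------|
| int   | 10                           | Integer (whole number) |
| float | 3.14                         | Decimal number         |
| str   | "Data"                       | Text (string)          |
| bool  | True / False                 | Boolean values         |
| list  | [1, 2, 3]                    | Ordered collection     |
| dict  | {"name": "Bob", "age": 30}   | Key-value pairs        |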
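
As a quick check of the table above, the built-in type() function reports the type of any value. The sketch below reuses the example values from the table; the variable names (samples, type_names, count, ratio, label) are only illustrative.

```python
# Example values taken from the data types table
samples = [10, 3.14, "Data", True, [1, 2, 3], {"name": "Bob", "age": 30}]

# type(value).__name__ gives the type's name as a string, e.g. "int"
type_names = [type(value).__name__ for value in samples]
for value, type_name in zip(samples, type_names):
    print(f"{value!r} is of type {type_name}")

# Converting between types with the built-in constructors
count = int("42")      # str -> int
ratio = float("0.5")   # str -> float
label = str(100)       # int -> str
```

Conversions like these come up often when values read from a file arrive as text.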

📋 Lists, Dictionaries, Loops, and Functions

Lists

Lists are used to store multiple items in one variable.

numbers = [10, 20, 30]
print(numbers[0])  # Access the first element
numbers.append(40)  # Add a new element
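
Beyond indexing and append, a few other list operations are common in data work. The following is a minimal sketch using the same numbers as above; the variable names are illustrative.

```python
numbers = [10, 20, 30, 40]

first_two = numbers[:2]               # slice: elements at index 0 and 1
last_item = numbers[-1]               # negative index counts from the end
doubled = [n * 2 for n in numbers]    # list comprehension builds a new list
average = sum(numbers) / len(numbers)  # mean using built-ins only

print(first_two, last_item, doubled, average)
```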

Dictionaries

Dictionaries store data as key-value pairs.

person = {"name": "Alice", "age": 25}
print(person["name"])  # Access value by key
person["city"] = "London"  # Add a new key-value pair
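
Accessing a missing key with square brackets raises a KeyError, so it can help to look keys up defensively. This sketch continues the person example; the "country" key and the "unknown" default are made up for illustration.

```python
person = {"name": "Alice", "age": 25, "city": "London"}

# .get() returns a default instead of raising KeyError for a missing key
country = person.get("country", "unknown")

# The "in" operator checks whether a key exists
has_city = "city" in person

# .items() yields (key, value) pairs for looping
lines = [f"{key}: {value}" for key, value in person.items()]
for line in lines:
    print(line)
```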

Loops

Loops help repeat actions efficiently.

For loop

for i in range(5):
    print(i)

While loop

count = 0
while count < 3:
    print("Count:", count)
    count += 1
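
In practice, loops usually run over a collection rather than a fixed range. Here is a small sketch that walks a list with enumerate() and accumulates a total; the temperature readings are made-up sample values.

```python
temperatures = [18.0, 21.0, 24.0]  # hypothetical sample readings

running_total = 0.0
for index, temp in enumerate(temperatures):
    # enumerate() yields the position and the item together
    running_total += temp
    print(f"Reading {index}: {temp}")

mean_temp = running_total / len(temperatures)
print("Mean:", mean_temp)
```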

Functions

Functions group reusable blocks of code.

def greet(name):
    return f"Hello, {name}!"

print(greet("Data Scientist"))
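
Functions can also take default parameter values and return structured results. The describe function below is a hypothetical helper written for this example, not part of any library; it assumes a non-empty list of numbers.

```python
def describe(values, precision=2):
    """Return the minimum, maximum, and mean of a non-empty list of numbers."""
    mean = sum(values) / len(values)
    return {
        "min": min(values),
        "max": max(values),
        "mean": round(mean, precision),  # precision defaults to 2 decimal places
    }

summary = describe([10, 20, 25, 30])
print(summary)
```

Returning a dictionary keeps each result labeled, which makes the output easier to read than a bare tuple.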

💻 Using Jupyter Notebook and VS Code

Jupyter Notebook

Jupyter Notebook is an interactive environment for data analysis, visualization, and experimentation.

You can organize code in cells and display charts or tables inline.
It's commonly used for:

To run Jupyter:

pip install jupyter
jupyter notebook

VS Code

VS Code is a general-purpose code editor with extensions for Python and Data Science.

It provides:

Recommended Extensions:

📚 Popular Python Libraries for Data Science

1. NumPy

Used for numerical and matrix operations.

import numpy as np

arr = np.array([1, 2, 3, 4])
print(arr.mean())  # Average
print(arr * 2)     # Vectorized operations
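
The example above uses a one-dimensional array. For the matrix side of NumPy, a minimal sketch with a small 2x2 array might look like this; the values and variable names are chosen only for illustration.

```python
import numpy as np

matrix = np.array([[1, 2], [3, 4]])
vector = np.array([10, 20])

column_means = matrix.mean(axis=0)  # mean of each column (axis=0 runs down the rows)
product = matrix @ vector           # matrix-vector product
transposed = matrix.T               # rows and columns swapped

print(column_means)
print(product)
print(transposed)
```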

Key features:

2. pandas

Used for data manipulation and analysis.

import pandas as pd

data = {"Name": ["Alice", "Bob"], "Age": [25, 30]}
df = pd.DataFrame(data)

print(df)
print(df["Age"].mean())  # Average age
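
Two other everyday pandas operations are adding a derived column and filtering rows by a condition. The sketch below rebuilds the same small DataFrame; the AgeInMonths column and the age threshold of 26 are made up for the example.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Alice", "Bob"], "Age": [25, 30]})

# Add a column computed from an existing one (applied to every row at once)
df["AgeInMonths"] = df["Age"] * 12

# Boolean indexing keeps only the rows where the condition is True
over_26 = df[df["Age"] > 26]

print(df)
print(over_26)
```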

Key features:

3. Matplotlib

Used for data visualization.

import matplotlib.pyplot as plt

x = [1, 2, 3, 4]
y = [10, 20, 25, 30]

plt.plot(x, y, marker='o')
plt.title("Simple Line Chart")
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
plt.show()

Key features:

🧠 Summary

| Concept       | Description                                                        |
|---------------|--------------------------------------------------------------------|
| Python        | The main programming language for data science                     |
| Data Types    | int, float, str, bool, list, dict                                  |
| Core Concepts | Loops, functions, variables                                        |
| Tools         | Jupyter Notebook (interactive), VS Code (editor)                   |
| Libraries     | NumPy (math), pandas (data analysis), Matplotlib (visualization)   |

With these foundations, you're ready to start working with real datasets and exploring the world of Data Science!