csv file operation, excel file operation

1. What is a csv file

"""
csv File called comma-separated value file  -  Each line separates columns by commas
csv Files can be passed directly excel Open to save and display data as rows and columns, but relative excel File, it can only store data, not formulas and functions.
"""

csv read operation

import csv#Import Module


# 1) Create an open csv file
f = open('files/Film.csv', 'r', encoding='utf-8')

# 2) Create reader to get file contents
""" 
csv.reader(File object)        -       Getting the contents of a file returns an iterator and returns each row in a list
csv.DictReader(File object)    -       Getting the contents of a file returns an iterator and returns each line starting with the second line in a dictionary(The key to the dictionary is the first line of content)
"""
# reader1 = csv.reader(f)
# print(list(reader1))

reader2 = csv.DictReader(f)
print(list(reader2))

f.close()

Exercise: Calculate the average salary for data analysis positions in each city in the hook data

# Get File Content
from re import findall
with open('files/lagou.csv', encoding='utf-8') as f:
    reader = csv.DictReader(f)
    # Processing data
    cities = {}
    for job in reader:
        c = job['city']
        m = job['salary']           # '8k-16k'
        result = findall(r'(\d+)k-(\d+)k', m)            # [(8, 16)]  ->  (8, 16)
        if not result:
            continue
        start, end = result[0]

        moneys = cities.get(c, [])
        moneys.append((int(start) + int(end))/2)
        cities[c] = moneys

    # Round (number, decimal) - Let the number retain the decimal number of the specified number of digits
    for key in cities:
        print(key, round(sum(cities[key]) / len(cities[key]), 2))

csv file write operation

import csv

Open File
f = open('files/data.csv', 'w', encoding='utf-8')

Create a writer object

"""
a. csv.writer(File object)        -       Establish writer Object that corresponds to a list of rows when writing data

b. csv.DictWriter(File object, Key List)        -   Establish writer Object to write data in dictionaries
"""
writer = csv.writer(f)

Write one line at a time

writer.writerow(['Full name', 'Date of birth', 'Gender', 'Telephone'])
writer.writerow(['Xiao Ming', '1999-9-9', 'male', '110'])

Write multiple lines at a time

 writer.writerows([
     ['Floret', '2000-3-24', 'female', '120'],
     ['Zhang San', '19998-7-5', 'male', '119']
 ])


writer = csv.DictWriter(f, ['Full name', 'Date of birth', 'Gender', 'Telephone'])

Write the header of the file (write the keys of the dictionary to the beginning of the file)

writer.writeheader()

Write one line at a time

writer.writerow({'Full name': 'xiaoming', 'Date of birth': '1999-9-9', 'Telephone':'110', 'Gender': 'male'})

Write multiple lines at a time

writer.writerows([
    {'Full name': 'xiaohua', 'Date of birth': '1999-9-9', 'Telephone':'110', 'Gender': 'male'},
    {'Full name': 'zhangsan', 'Date of birth': '1999-9-9', 'Telephone':'110', 'Gender': 'male'}
])

1. System Environment - Installing Python provides a Python system environment for your computer

2. Virtual Environments - Python environments that programmers create themselves on demand

The premise of creating a virtual environment is that there is a system environment

3. Role of the environment:

(1) Provide a Python interpreter
(2) Provide third-party libraries
The existence of virtual environments allows third-party libraries to be managed separately by category or project.

4. Recommendations for using virtual environments:

(1) When working: a project has a virtual environment and places the virtual environment directly in the project
(2) When learning: one kind of project is a virtual environment, and all kinds of virtual environments are placed in one place

5. How to create a virtual environment

(1) Create using Pycharm
Open Settings page - >

(2) Create using instructions

1. Understanding excel files

import openpyxl#Call third-party libraries
"""
Workbook: One excel A file is a workbook
 Worksheet: You can have more than one worksheet in a workbook(At least one)
Cell: Cell is excel The basic unit in which files hold data
 Row and column numbers: cell positions can be determined
"""

2. Get excel file contents

(1)open excel File Creation Workbook Object
"""
openpyxl.open(excel File Path)    
openpyxl.load_workbook(excel File Path)
"""
workbook = openpyxl.open('files/Three Kingdoms Person Data.xlsx')
workbook = openpyxl.load_workbook('files/Three Kingdoms Person Data.xlsx')

Gets the table names of all sheets in the workbook

result = workbook.sheetnames
print(result)       # ['All Person Data','Force of Three Countries','Force of Three Military Generals','Data of Three Military Generals','Data of Three Military Officers']

(2) Getting worksheets

"""
Workbook.active        -      Get active tables(Selected table) 
Workbook[Sheet Name]        -   Gets the worksheet with the specified name
"""
sheet1 = workbook.active
print(sheet1)

sheet2 = workbook['Three Kingdoms General Data']
print(sheet2)

(3) Get cells

"""
Sheet Object.cell(Line Number, Column number)
"""
cell1 = sheet2.cell(8, 1)
cell2 = sheet2.cell(12, 1)
print(cell1, cell2)

(4) Get cell contents

# Cell object.value
print(cell1.value)
print(cell2.value)

(5) Get the maximum row and column numbers (valid rows and valid columns that hold data)

# Worksheet object.max_row
# Worksheet object.max_column
print(sheet2.max_row)
print(sheet2.max_column)

Get all the data in the first column

column1 = []
for row in range(1, sheet2.max_row+1):
    cell = sheet2.cell(row, 1)
    column1.append(cell.value)

print(column1)

Get all the data in columns 1 through 3

for col in range(1, 4):
    column = []
    for row in range(1, sheet2.max_row+1):
        cell = sheet2.cell(row, col)
        column.append(cell.value)
    print(column)

import openpyxl

Note: No matter how excel is written, it must be saved after the operation is completed.

1. New Workbook

(1) New Workbook Object

workbook = openpyxl.Workbook()

(2) Save

# Workbook object.save (file path)
workbook.save('files/student.xlsx')

When creating a new workbook in practice, you need to first determine if the corresponding file of the workbook already exists, if it exists, you do not need to create a new one, if it does not exist, you can create a new one

Method 1:

try:
    workbook = openpyxl.open('files/student2.xlsx')
except FileNotFoundError:
    workbook = openpyxl.Workbook()
    workbook.save('files/student2.xlsx')

Method 2:

import os
# # Os.path. Exists (File Path) - Determines whether the specified file exists, returns True if it exists, and returns False if it does not exist
if os.path.exists('files/student2.xlsx'):
    workbook = openpyxl.open('files/student2.xlsx')
else:
    workbook = openpyxl.Workbook()
    workbook.save('files/student2.xlsx')

2. Writing of worksheets

(1) New worksheet

# Workbook object. Create_ Sheet (table name, subscript)
workbook.create_sheet()
workbook.create_sheet('Python')
workbook.create_sheet('Java', 0)
workbook.save('files/student2.xlsx')

New tables in practice: create new tables when you don't have time, and sometimes open them directly

if 'Python' in workbook.sheetnames:
    sheet = workbook['Python']
else:
    sheet = workbook.create_sheet('Python')
    workbook.save('files/student2.xlsx')

(2) Delete worksheets

# Workbook object.remove (worksheet object)
workbook.remove(workbook['Sheet1'])
workbook.save('files/student2.xlsx')

Delete tables in practice: Delete tables when they exist

if 'Sheet1' in workbook.sheetnames:
    workbook.remove(workbook['Sheet1'])
    workbook.save('files/student2.xlsx')

3. Writing of Cells

# Cell object. value = Data
java_sheet = workbook['Java']
java_sheet.cell(1, 3).value = 'Telephone'
java_sheet.cell(2, 1).value = None
java_sheet.cell(4, 2).value = 'stu003'

workbook.save('files/student2.xlsx')

Tags: Python Data Analysis Data Mining

Posted by unmash on Sat, 10 Sep 2022 21:50:10 +0530