Python Best Practice Guide

1. Organize your codebase reasonably and choose the right code management tool

The structure of a typical Python project codebase roughly includes:


2. Follow a common code style

PEP 8 is the Python community's style bible. Because nearly everyone follows it, almost all code in the Python community shares the same style.

  • File encoding and Unicode

    1. Source files use UTF-8 encoding without a BOM.
    2. Use UNIX-style \n newlines instead of DOS-style \r\n.
    3. The .py file header contains:
       #!/usr/bin/env python
       # -*- coding: utf-8 -*-
  • Naming

    1. Lowercase: functions, methods, variables, packages, modules
    2. CamelCase: class names
    3. Leading single underscore: protected methods and internal attributes
    4. Leading double underscore: private methods
    5. All caps: constants
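
The naming rules above can be sketched on a small module. HttpClient, MAX_RETRIES, and every other name below are hypothetical and chosen only for illustration.

```python
MAX_RETRIES = 3  # all caps: module-level constant


class HttpClient:  # CamelCase: class name
    def __init__(self, base_url):
        self.base_url = base_url   # lowercase: public attribute
        self._request_count = 0    # single underscore: internal ("protected") attribute
        self.__token = "secret"    # double underscore: name-mangled ("private") attribute

    def build_url(self, path):     # lowercase with underscores: method
        self._request_count += 1
        return self._join(self.base_url, path)

    def _join(self, base, path):   # single underscore: internal helper method
        return base.rstrip("/") + "/" + path.lstrip("/")


client = HttpClient("https://example.com/")
print(client.build_url("/status"))
# Name mangling stores the attribute as _HttpClient__token: hidden by convention, not truly private.
print(hasattr(client, "_HttpClient__token"))
```

Note that the double underscore only renames the attribute; it discourages outside access rather than preventing it.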
  • Module imports

    1. Import the entire module rather than symbols from the module (classes, functions, etc.), unless the third-party package explicitly tells you to import symbols.

    Reason: this helps avoid circular imports.

    *Personal note: there is a pitfall here. If a .py file contains extra top-level function calls or data-modifying statements, they are executed implicitly on import (including import *), and the resulting problems are hard to track down.

    2. Import modules at the top of the file, grouped in the order of standard library modules, third-party modules, and local modules.

    Reason: the origin of each module is clear at a glance. Some IDEs offer a shortcut (for example Ctrl+Shift+O) to optimize imports.
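
A minimal sketch of the grouping described above. The third-party and local groups are shown only as comments so the example runs without extra installs; `requests` is used as a stand-in third-party package and `myapp` is a hypothetical local package.

```python
# Group 1: standard library modules
import json
import os.path

# Group 2: third-party modules would go here, e.g. `import requests`

# Group 3: local modules would go here, e.g. `from myapp import settings`
# (`myapp` is a hypothetical package name)

# Importing the module rather than its symbols keeps the origin visible at the call site.
config_path = os.path.join("conf", "app.json")
payload = json.dumps({"path": config_path})
print(payload)
```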

  • Indentation

    4 spaces, no tabs

  • Familiarize yourself with common conventions

    Understand the role of if __name__ == "__main__":, and understand how packages are loaded: the roles of sys.path and sys.modules.
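
A small sketch of these conventions, assuming the file is saved as a standalone script. The function name main is just a common convention, not something the text prescribes.

```python
import json  # importing caches the module object in sys.modules
import sys


def main():
    # sys.path is the list of locations searched when a module is imported.
    print("first search path entries:", sys.path[:2])
    # sys.modules maps the names of already-loaded modules to module objects,
    # so a second `import json` reuses this cached object.
    print("json is cached:", sys.modules["json"] is json)
    return 0


# True only when the file is run as a script, not when it is imported by another module.
if __name__ == "__main__":
    main()
```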

  • Comments

    Refer to PEP 257 for docstrings; project documentation can be generated with reStructuredText and Sphinx.

  • Don't make a single line of code too long; break long lines with parentheses or a trailing \

  • Use blank lines sensibly to make the code more readable

  • Keep the code as flat as possible, don't nest too deeply

    1. Do not nest if, for, while, try/except/finally, and with too deeply.
    2. Too many parameters in a method definition or call, or badly placed line breaks, also make the code structure unsightly.
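
One common way to keep code flat is to return early with guard clauses. The sketch below contrasts a nested and a flat version of the same hypothetical function; the user dictionary layout is an assumption made for the example.

```python
def describe_nested(user):
    # Nested version: every check adds one more indentation level.
    if user is not None:
        if user.get("active"):
            if user.get("email"):
                return "notify " + user["email"]
            else:
                return "no email"
        else:
            return "inactive"
    else:
        return "missing"


def describe_flat(user):
    # Flat version: guard clauses handle the exceptional cases first and return early.
    if user is None:
        return "missing"
    if not user.get("active"):
        return "inactive"
    if not user.get("email"):
        return "no email"
    return "notify " + user["email"]
```

Both functions return the same result for every input; only the flat one can be read top to bottom without tracking indentation.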


  • Write as many unit tests as possible:
    1. Keep units small so tests run fast (though a slow test is still better than none); usually one TestCase per class.
    2. With unit tests in place, refactoring is much safer.
    3. Integration tests: organized by use case and scenario.
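
A minimal sketch of "one TestCase per class" with the standard unittest module. Stack is a hypothetical class under test, invented for this example.

```python
import unittest


class Stack:
    """Hypothetical class under test."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        if not self._items:
            raise IndexError("pop from empty stack")
        return self._items.pop()


class StackTestCase(unittest.TestCase):
    """One TestCase for the Stack class; each test method checks one behavior."""

    def setUp(self):
        # A fresh Stack for every test keeps the tests independent.
        self.stack = Stack()

    def test_pop_returns_last_pushed_item(self):
        self.stack.push(1)
        self.stack.push(2)
        self.assertEqual(self.stack.pop(), 2)

    def test_pop_on_empty_stack_raises(self):
        with self.assertRaises(IndexError):
            self.stack.pop()
```

Saved in a file whose name starts with test_, these tests can be run with python -m unittest.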

3. Repair broken windows immediately

The broken windows theory applies to coding as well: when you see anything wrong with your code (bad design, a bad decision, or bad code), fix it right away.

4. Refactor your code

Refactor while developing. When you find something wrong while writing code, refactor, modify, and test it promptly.

It is tempting to think "get it working first and refactor later", and then forget about it. This approach is not advisable: the longer you put refactoring off, the harder it becomes and the less inclined you are to do it.

Refactoring is a long-term ongoing activity.

String concatenation: joining a large number of strings with + performs poorly

s = "".join(["Life", "is", "short", "I", "love", "Python"])  # better
s = "Life" + "is" + "short" + "I" + "love" + "Python"  # worse
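
The difference matters most when the pieces are collected in a loop. A minimal sketch, using the same words as above:

```python
words = ["Life", "is", "short", "I", "love", "Python"]

# Repeated += creates a new string object on most iterations,
# so the partial result may be copied again and again.
sentence_plus = ""
for word in words:
    sentence_plus += word + " "
sentence_plus = sentence_plus.rstrip()

# str.join receives all the pieces at once and builds the result in a single step.
sentence_join = " ".join(words)

print(sentence_join)
```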

Use of in

# Use in wherever possible; implement __contains__ to support in, and __iter__ to support for x in y
for key in d: print(key)  # better
for key in d.keys(): print(key)  # worse; to modify the dictionary while traversing, iterate over a copy of the keys, e.g. for key in list(d)

if key not in d: d[key] = 1  # better
if not key in d: d[key] = 1  # worse
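
A short sketch of both points: a container that supports in and for through __contains__ and __iter__, and a dictionary that is modified while its keys are traversed. Playlist is a hypothetical class used only for illustration.

```python
class Playlist:
    """Hypothetical container that supports `in` and `for`."""

    def __init__(self, songs):
        self._songs = list(songs)

    def __contains__(self, title):
        # Called for `title in playlist`.
        return title in self._songs

    def __iter__(self):
        # Called for `for title in playlist`.
        return iter(self._songs)


playlist = Playlist(["intro", "theme"])
found = "theme" in playlist
titles = [title for title in playlist]

# Deleting entries while traversing: iterate over a copy of the keys.
# In Python 3, looping over d or d.keys() directly while deleting raises RuntimeError.
d = {"a": 1, "b": 2, "c": 1}
for key in list(d):
    if d[key] == 1:
        del d[key]
```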

Use of if

if not x: is better than if x == 0, if x == "", if x == False (use if x is None when None must be told apart from other false values)
if x: is better than if x != 0, if x != None (use if x is not None when 0 or an empty value is valid)
Likewise, for sequences prefer if x / if not x over comparing len(x) with 0
if 1 < x < 5 is better than if x > 1 and x < 5  # Python comparison operators can be chained

if x is None and if x is not None are better than if x == None and if x != None
if x == 1 is better than if (x == 1)  # the parentheses are redundant
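
The rules above interact: a plain truth test treats None, 0, and empty containers alike, so the identity check comes first whenever None has its own meaning. A minimal sketch with hypothetical function names:

```python
def describe(value):
    # Identity check first: None means "no value supplied".
    if value is None:
        return "missing"
    # Truth test: covers 0, "", [], {} and other empty or zero values.
    if not value:
        return "empty or zero"
    return "present"


def in_open_range(x):
    # Chained comparison, equivalent to `x > 1 and x < 5`.
    return 1 < x < 5
```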

Swap the values of variables without temporary variables

a, b = b, a
a, b, c, d = d, b, c, a

Code performance optimization

  1. Profile before optimizing to avoid blind optimization that hurts code readability.
  2. Prefer algorithms with lower time complexity. Generally, the larger the data, the more obvious the advantage of a good algorithm.

  3. Use the simplest approach that works, e.g. str.startswith is better than a regular expression.

  4. **Use dict or set for element lookup instead of list**
  5. Use map/filter to process list elements instead of a for loop

  6. Use list comprehensions instead of for loops; when the logic inside a comprehension becomes complicated, a for loop is more readable

  7. Minimize the number of traversals: if one pass can do the work, do not traverse multiple times.


  8. When you need to fetch multiple values from redis, use a pipeline to reduce the number of IO round trips.

  9. Use multithreading or coroutines for IO-intensive tasks.
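
A few of the points above can be sketched with the standard library only: measuring before optimizing, set lookup versus list lookup, str.startswith, and a single traversal. The data sizes, the "tmp_" prefix, and the function names are arbitrary choices for this example, and the measured times will vary by machine.

```python
import timeit

ids_list = list(range(100_000))
ids_set = set(ids_list)

# Measure first: membership in a list scans the elements, membership in a set uses hashing.
list_time = timeit.timeit(lambda: 99_999 in ids_list, number=100)
set_time = timeit.timeit(lambda: 99_999 in ids_set, number=100)
print(f"list lookup: {list_time:.4f}s, set lookup: {set_time:.6f}s")


def has_tmp_prefix(name):
    # A plain string method is enough here; no regular expression is needed.
    return name.startswith("tmp_")


def min_max_total(values):
    """Compute minimum, maximum, and sum of a non-empty sequence in one traversal."""
    lowest = highest = values[0]
    total = 0
    for value in values:
        if value < lowest:
            lowest = value
        if value > highest:
            highest = value
        total += value
    return lowest, highest, total
```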

Reusable Object Oriented Design

  1. Use composition, inheritance, and mixins sensibly for code reuse

  Prefer composition (has-a), then inheritance (only for a true is-a relationship). Implementing multiple inheritance through mixins can make the code cleaner.

  All three approaches are used in the search platform, for example the database model IspTable, the algorithm model ModelMixin, and the tokenizer JiebaTokenizer.

  2. Build reusable toolkits and utility classes where appropriate

  DBHelper, LogHelper, app.common.utils

  3. Use monkey-patching or inheritance to extend the functionality of third-party packages.

  For example: the implementation in jieba_wrap
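
A minimal sketch of composition and a mixin working together. Engine, Car, and JsonMixin are hypothetical names; they are not the IspTable, ModelMixin, or JiebaTokenizer classes mentioned above, whose code is not shown in this guide.

```python
import json


class JsonMixin:
    """Mixin: adds one reusable capability without being a base type of its own."""

    def to_json(self):
        # Serialize only public attributes (names without a leading underscore).
        public = {k: v for k, v in vars(self).items() if not k.startswith("_")}
        return json.dumps(public, sort_keys=True)


class Engine:
    def start(self):
        return "engine started"


class Car(JsonMixin):
    def __init__(self, model):
        self.model = model
        self._engine = Engine()  # composition: a Car has an Engine

    def start(self):
        # Delegate to the composed object instead of inheriting from Engine.
        return self._engine.start()


car = Car("sedan")
print(car.start())
print(car.to_json())
```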


Multithreading, multiprocessing

  1. Don't use multithreading for compute-intensive tasks: in CPython the global interpreter lock (GIL) keeps threads from running Python bytecode in parallel, so it brings no speedup. For IO-intensive tasks, multithreading can improve processing speed.

  2. When multiple processes write logs concurrently to the same file, the built-in logging package runs into problems. Use concurrent_log_handler instead.
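
A minimal sketch of threads applied to an IO-intensive task, using concurrent.futures from the standard library. fake_download is a hypothetical stand-in that sleeps instead of touching the network; for compute-intensive work, ProcessPoolExecutor from the same module is the usual alternative.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_download(item_id):
    # time.sleep stands in for waiting on the network; the thread is idle while it waits.
    time.sleep(0.1)
    return item_id * 2


def download_all(item_ids, max_workers=8):
    # The waits overlap across worker threads; results come back in input order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fake_download, item_ids))


results = download_all([1, 2, 3, 4])
print(results)
```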

Design Patterns

Reasonable use of design patterns can simplify the corresponding programming tasks and make the code clear and readable.

  1. Decorator pattern: logging, exception handling, performance measurement, etc.

  2. Singleton pattern: Jieba Tokenizer, isolated per tenant
  3. Publish-subscribe pattern: logical decoupling
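
The decorator case can be sketched as follows. The decorator name logged and the function divide are hypothetical; the sketch combines logging, exception reporting, and timing in one wrapper, which is only one of several reasonable designs.

```python
import functools
import logging
import time

logger = logging.getLogger(__name__)


def logged(func):
    """Log failures and elapsed time of the wrapped function."""

    @functools.wraps(func)  # keep the original name and docstring
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        except Exception:
            # Record the traceback, then re-raise so callers still see the error.
            logger.exception("%s failed", func.__name__)
            raise
        finally:
            elapsed = time.perf_counter() - start
            logger.info("%s took %.6f s", func.__name__, elapsed)

    return wrapper


@logged
def divide(a, b):
    return a / b
```

The business function stays free of logging and timing code, and the same decorator can be applied to any other function.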

Common class naming conventions

  1. Prefixes: SimpleXXX, DefaultXXX, StandardXXX

  2. Suffixes: XXXService, XXXMixin, XXXHelper, XXXError, XXXException, XXXEnum

  3. Interfaces: declarations mostly use XXXable, indicating that the implementing class has that capability, such as Runnable, Configurable, Customizable, Immutable, Iterable, Cloneable, Serializable...
  4. Implementation classes: declarations mostly use noun forms such as XXXRunner and XXXConfiguration.

5. Create consistent documentation

It may seem like a burden, but proper documentation is the cornerstone of producing clean code in the life of a project. The Python community uses three simple tools or concepts to simplify documentation:

  • reStructuredText

  • Docstrings

  • Sphinx

With these three tools, you can write comments (docstrings) in your Python code that conform to reStructuredText syntax, then use Sphinx to automatically generate project documentation in various formats (PDF, HTML...). You can also publish the documentation to the Read the Docs hosting site for others to view online.
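
A minimal sketch of a docstring written with reStructuredText field lists, the form that Sphinx's autodoc extension can pull into generated pages. The function clamp is a hypothetical example.

```python
def clamp(value, low, high):
    """Limit a number to the closed interval [low, high].

    :param value: number to limit
    :param low: lower bound of the interval
    :param high: upper bound of the interval
    :returns: ``value`` if it lies in the interval, otherwise the nearest bound
    :raises ValueError: if ``low`` is greater than ``high``
    """
    if low > high:
        raise ValueError("low must not exceed high")
    return max(low, min(value, high))


# The docstring is available at runtime, which is what documentation tools read.
print(clamp.__doc__.splitlines()[0])
```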

A large number of open source projects maintain documentation this way:

  • flask

  • gunicorn

  • tornado

6. Become proficient with PyPI

Get other people's projects from PyPI and publish your project to PyPI.

pip install requests

7. Reference books

• The Python Cookbook

• Fluent Python

• Effective Python: 59 Specific Ways to Write Better Python

8. The Zen of Python

The Zen of Python, by Tim Peters 

 Beautiful is better than ugly.
 Explicit is better than implicit.
 Simple is better than complex.
 Complex is better than complicated.
 Flat is better than nested.
 Sparse is better than dense.
 Readability counts.
 Special cases aren't special enough to break the rules.
 Although practicality beats purity.
 Errors should never pass silently.
 Unless explicitly silenced.
 In the face of ambiguity, refuse the temptation to guess.
 There should be one-- and preferably only one --obvious way to do it.
 Although that way may not be obvious at first unless you're Dutch.
 Now is better than never.
 Although never is often better than *right* now.
 If the implementation is hard to explain, it's a bad idea.
 If the implementation is easy to explain, it may be a good idea.
 Namespaces are one honking great idea -- let's do more of those!

Posted by ivytony on Wed, 01 Jun 2022 19:26:26 +0530