Background:
OpenAI has opened up its API, and Peking University released the ChatExcel tool a few days ago. The two events are unrelated, but engineers always find something to do when they have nothing to do. In a tech group chat I bragged that, now that OpenAI has opened up its API, I could build a ChatExcel too, and one even better than theirs. It would have to:
1. Accept natural-language requirements
2. Accurately understand what the user wants
3. Produce accurate analysis results
4. Present the results as a visual report
5. Ideally, also generate a PPT presentation
So it was back down into the trenches. To quickly build a minimum viable PoC of the product, I used the OpenAI API together with Visual ChatGPT. For a real product, all of this would have to be wrapped up and built on the OpenAI API alone; the user would see nothing but a requirement-input box and a place to upload a CSV table. Since I am only verifying the upper and lower bounds of the product here, please allow me this harmless foul.
The idea is as follows:
1. After the user uploads a table, parse the header and extract the meta information, ready for the subsequent analysis of the user's requirement
2. Format the requirement description and let the OpenAI API generate the data-analysis code automatically (when commercialized, users' broad requirements could first be converted into this formatted input via OpenAI)
3. Parse the generated Python code out of the response and save it as a .py file
4. Execute the script with Python's os module, rendering the data visualization as HTML for easy viewing
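The four steps above can be sketched roughly as follows. This is a minimal illustration, not the actual project code: the function names, prompt wording, and file paths are my own assumptions, and the OpenAI API call that turns the prompt into code is left out so the sketch stays self-contained.

```python
import os
import sys

import pandas as pd


def parse_meta(csv_path):
    # Step 1: read a few rows and extract column names and dtypes as meta info.
    df = pd.read_csv(csv_path, nrows=5)
    return {col: str(dtype) for col, dtype in df.dtypes.items()}


def build_prompt(meta, requirement):
    # Step 2: format the meta info and the (already formatted) requirement
    # into a prompt for the code-generating API call.
    columns = ', '.join(f'{name} ({dtype})' for name, dtype in meta.items())
    return (f'A CSV file has the columns: {columns}.\n'
            f'Write a complete Python script that {requirement} '
            f'and renders any visualization to an HTML file.')


def save_and_run(code, path='analysis.py'):
    # Steps 3-4: save the generated code as a .py file and execute it;
    # a non-zero return value means the script failed.
    with open(path, 'w', encoding='utf-8') as f:
        f.write(code)
    return os.system(f'"{sys.executable}" {path}')
```

In a real product the generated code would of course need to be sandboxed before being executed like this.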
Let's look at the actual effect.
Calling the API with a natural-language description generates the code, but the returned code comes back squashed together, so it has to be parsed out.
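A minimal sketch of such a parser, assuming the model wraps code in markdown fences; the fallback of returning the whole reply unchanged is my own simplification:

```python
import re


def extract_code(reply):
    # Collect all fenced code blocks (``` or ```python) from the model's reply.
    blocks = re.findall(r"```(?:python)?\s*\n(.*?)```", reply, re.DOTALL)
    if blocks:
        return "\n".join(blocks)
    # No fences found: assume the whole reply is code and return it unchanged.
    return reply
```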

For the code parsing I took a lazy shortcut and let ChatGPT analyze the result.

The code generated by the OpenAI API, after parsing, is shown below. Because of package version differences there are some conflicts; to verify quickly, I gave up on resolving them and asked ChatGPT to generate the code for the task again.
```python
import pandas as pd
import jieba
from pyecharts import WordCloud  # fails on pyecharts v1+, where charts moved to pyecharts.charts

# read in the data from the CSV file
data = pd.read_csv('product_reviews.csv')

# split product reviews into individual words
reviews = data['Product Review']
word_freq = {}
for review in reviews:
    words = jieba.cut(review)
    for word in words:
        if word in word_freq:
            word_freq[word] += 1
        else:
            word_freq[word] = 1

# sort the words by frequency
sorted_word_freq = sorted(word_freq.items(), key=lambda x: x[1], reverse=True)

# print the top 10 most frequent words
print('Top 10 most frequent words:')
for word, freq in sorted_word_freq[:10]:
    print(f'{word}: {freq}')

# create a word cloud of the top 50 most frequent words
wordcloud = WordCloud(width=800, height=620)
wordcloud.add("", sorted_word_freq[:50], word_size_range=[20, 100])
wordcloud.render('wordcloud.html')
```
ChatGPT parses the task and generates code.
There were still some small bugs in the above code,
so I switched to generating the code with ChatGPT directly.


```python
import pandas as pd
import jieba
from wordcloud import WordCloud
import matplotlib.pyplot as plt
from collections import Counter

# read the data from the table
df = pd.read_csv('product_reviews.csv')

# create a list of common Chinese stop words
stop_words = ['的', '了', '是', '我', '你', '他', '她', '我们', '你们', '他们']

# tokenize the product reviews and count the frequency of each word
words_list = []
for review in df['Product Review']:
    words = jieba.lcut(review)
    words_list.extend(words)
words_freq = Counter(words_list)

# remove stop words from the word frequency counter
for stop_word in stop_words:
    words_freq.pop(stop_word, None)

# sort the word frequencies in descending order
sorted_words_freq = sorted(words_freq.items(), key=lambda x: x[1], reverse=True)

# print the top 10 most frequent words
print('Top 10 most frequent words in product reviews:')
for word, freq in sorted_words_freq[:10]:
    print(f'{word}: {freq}')

# create a word cloud from the top 50 most frequent words
# (generate_from_frequencies expects a dict, not a list of tuples)
wordcloud = WordCloud(background_color='white', width=800, height=400) \
    .generate_from_frequencies(dict(sorted_words_freq[:50]))

# plot the word cloud
plt.figure(figsize=(12, 6))
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis('off')
plt.show()

# save the word cloud as an image (to_file writes image formats, not HTML)
wordcloud.to_file('wordcloud.png')
```
It even tells you which packages to install.
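For reference, the install step would look something like this; the exact package list here is my reconstruction from the imports above, not ChatGPT's verbatim output:

```shell
pip install pandas jieba wordcloud matplotlib
```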
I also had ChatGPT generate some data for testing.

```python
import csv
import random

# list of product names
product_names = ['Product A', 'Product B', 'Product C', 'Product D', 'Product E']

# generate product review data
product_reviews = []
for i in range(1000):
    # pick a product name at random
    product_name = random.choice(product_names)
    # generate a random comment
    product_review = f"This is a great {product_name}!"
    # randomly generate exposure and click counts
    num_exposures = random.randint(1, 100)
    click_count = random.randint(0, num_exposures)
    # add to the product review list
    product_reviews.append([product_name, product_review, num_exposures, click_count])

# write the product review data to a CSV file
with open('product_reviews.csv', mode='w', newline='') as csv_file:
    writer = csv.writer(csv_file)
    # write the header
    writer.writerow(['Product Name', 'Product Review', 'Number of Exposures', 'Click Count'])
    # write the data rows
    writer.writerows(product_reviews)
```
Integrating the project and testing the end-to-end effect:

Generated data

Word cloud produced by the generated code
Summary:
1. Overall, ChatGPT is already very powerful and does a very good job at basically every step; there were only minor problems with package versions and data-type conversion (list vs. dictionary)
2. By breaking the work into concrete tasks and wiring the steps together, ChatGPT can basically handle real production work end to end
3. Whether using ChatGPT as a single system or chaining the OpenAI API with a simple business workflow, the results when building AI application products are amazing
4. Future product interaction will be more human-friendly and concise
5. As for the minor problems in point 1, I think they can be completely solved by fine-tuning on domain code