How do you extract column data with Python pandas?

2021-02-02 16:04

Replies

1 # User 1465424935672

Creating the data

Use Python's zip to build a list of tuples, rec, which serves as the input data for the DataFrame.

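In [3]: import pandas as pd

In [4]: import random

# range works on both Python 2 and 3 (xrange exists only on Python 2)
In [5]: num = random.sample(range(10000, 1000000), 5)

In [6]: num
Out[6]: [244937, 132008, 278446, 613409, 799201]

In [8]: names = "hello the cruel world en".split()

In [9]: names
Out[9]: ["hello", "the", "cruel", "world", "en"]

# list() materializes the pairs; on Python 3, zip alone returns a lazy iterator
In [10]: rec = list(zip(names, num))

# Column names: 姓名 = name, 業績 = sales performance
In [15]: data = pd.DataFrame(rec, columns=[u"姓名", u"業績"])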

In [16]: data
Out[16]:
      姓名      業績
0  hello  244937
1    the  132008
2  cruel  278446
3  world  613409
4     en  799201

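The first argument to the DataFrame constructor is the data source. The second argument, columns, sets the header of the resulting table, that is, its column (field) names.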
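Since the question is about extracting column data, here is a minimal sketch of the common ways to pull columns out of the DataFrame built above. It hard-codes the sample values from the session instead of drawing random ones, and the variable names (sales, both, names, first_col) are only illustrative.

```python
import pandas as pd

# Same sample values as the session above; 姓名 = name, 業績 = sales performance.
rec = [("hello", 244937), ("the", 132008), ("cruel", 278446),
       ("world", 613409), ("en", 799201)]
data = pd.DataFrame(rec, columns=["姓名", "業績"])

sales = data["業績"]            # a single label returns a Series
both = data[["姓名", "業績"]]    # a list of labels returns a DataFrame
names = data.loc[:, "姓名"]     # label-based selection over all rows
first_col = data.iloc[:, 0]     # position-based selection (first column)

print(sales.tolist())
print(names.tolist())
```

Indexing with a single label is usually enough; loc and iloc are useful when rows need to be restricted at the same time.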

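Exporting the data to CSV

To sidestep encoding problems on Windows, first apply a simple workaround so that ipython-notebook handles UTF-8. This applies to Python 2 only: Python 3 has no built-in reload and no sys.setdefaultencoding, and its strings are already Unicode, so the step can be skipped there.

# Python 2 only
import sys
reload(sys)
sys.setdefaultencoding("utf8")

Now the data can be exported.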

In [31]: data
Out[31]:
      姓名      業績
0  hello  244937
1    the  132008
2  cruel  278446
3  world  613409
4     en  799201

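# In ipython-notebook, append a question mark to a name to view its help; press q to leave the help view

In [32]: data.to_csv?

# Header labels: 僱員 = employee, 銷售業績 = sales performance
In [33]: data.to_csv("c:\\out.csv", index=True, header=[u"僱員", u"銷售業績"])

This exports data to the file out.csv. The index argument controls whether the row index is written. If header is not given, the column names of data are used as the header row. If header is given, the strings in the list are used instead, and the list must contain exactly as many entries as data has columns.

You can open out.csv on the C: drive with Notepad++ to check the result.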
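The behavior of index and header can be checked without touching the disk. The sketch below writes to an in-memory buffer (an assumption made here for convenience; the original writes to c:\out.csv) and also shows what happens when the header list has the wrong length.

```python
import io
import pandas as pd

# Same sample values as above; 姓名 = name, 業績 = sales performance.
data = pd.DataFrame({
    "姓名": ["hello", "the", "cruel", "world", "en"],
    "業績": [244937, 132008, 278446, 613409, 799201],
})

# Write to an in-memory text buffer instead of a file on disk.
buf = io.StringIO()
data.to_csv(buf, index=True, header=["僱員", "銷售業績"])
lines = buf.getvalue().splitlines()
print(lines[0])   # header row; the index column has an empty label
print(lines[1])   # first data row, prefixed with its index value

# A header list whose length differs from the number of columns is rejected.
mismatch_rejected = False
try:
    data.to_csv(io.StringIO(), header=["僱員"])
except ValueError:
    mismatch_rejected = True
print(mismatch_rejected)
```

Passing index=False instead would drop the leading index column from every row.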

Simple data analysis

In [43]: data
Out[43]:
      姓名      業績
0  hello  244937
1    the  132008
2  cruel  278446
3  world  613409
4     en  799201

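# Sort by sales in descending order and take the top three.
# DataFrame.sort was removed in later pandas releases; sort_values is its replacement.

In [46]: Sorted = data.sort_values(by=[u"業績"], ascending=False)
         Sorted.head(3)
Out[46]:
      姓名      業績
4     en  799201
3  world  613409
2  cruel  278446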
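As a self-contained sketch of the same top-three query, the following uses the sample values from the session. It also shows nlargest, an alternative not used in the original answer, which should give the same rows here.

```python
import pandas as pd

# Same sample values as above; 姓名 = name, 業績 = sales performance.
data = pd.DataFrame({
    "姓名": ["hello", "the", "cruel", "world", "en"],
    "業績": [244937, 132008, 278446, 613409, 799201],
})

# Sort the whole table by sales, highest first, then keep the first three rows.
ranked = data.sort_values(by="業績", ascending=False)
top3 = ranked.head(3)

# Alternative: ask directly for the three rows with the largest sales.
top3_alt = data.nlargest(3, "業績")

print(top3["姓名"].tolist())
```

Both results keep the original row labels (4, 3, 2), which makes it easy to trace the rows back to the unsorted table.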

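Plotting

In [71]: import matplotlib.pyplot as plt

# Make ipython-notebook render matplotlib plots inline
%matplotlib inline

In [74]: df = data

# Draw the plot
df[u"業績"].plot()

MaxValue = df[u"業績"].max()

# .values returns an array, so take its first element to get the name as a plain string
MaxName = df[u"姓名"][df[u"業績"] == df[u"業績"].max()].values[0]

Text = str(MaxValue) + " - " + MaxName

# Add a text annotation to the plot
plt.annotate(Text, xy=(1, MaxValue), xytext=(8, 0), xycoords=("axes fraction", "data"), textcoords="offset points")

If the plt.annotate line is commented out, the plot is drawn without the text label.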
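For running the plotting step outside a notebook, a minimal sketch might look like the following. The Agg backend, the use of idxmax to locate the top row, and the output filename sales_plot.png are all assumptions made for this example and do not come from the original answer.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is required
import matplotlib.pyplot as plt
import pandas as pd

# Same sample values as above; 姓名 = name, 業績 = sales performance.
df = pd.DataFrame({
    "姓名": ["hello", "the", "cruel", "world", "en"],
    "業績": [244937, 132008, 278446, 613409, 799201],
})

ax = df["業績"].plot()

# Locate the row with the highest sales and build the label text from it.
top_row = df.loc[df["業績"].idxmax()]
max_value = int(top_row["業績"])
label = "{} - {}".format(max_value, top_row["姓名"])

# x is given as a fraction of the axes width (1 = right edge), y in data units;
# the text is then shifted 8 points to the right of that anchor.
note = ax.annotate(label, xy=(1, max_value), xytext=(8, 0),
                   xycoords=("axes fraction", "data"),
                   textcoords="offset points")

plt.savefig("sales_plot.png", bbox_inches="tight")  # hypothetical output file
```

Using idxmax avoids the boolean mask and returns a single row even when the result is needed as scalars.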
