Python Data Analysis: Read This Article and You Will Fully Master Data Cleaning

Published: 2019-09-16 00:24:22 | Category: Tutorials | Source: 哗啦圈的梦

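Summary: The prerequisite for any data analysis is that you have data, and that the data has already been cleaned and organized into the format you need. Wherever the data came from, you need to inspect it carefully and clean up any records that do not conform. This step is not strictly mandatory, but it is a good habit, because during later analysis you may well discover that, because of an earlier…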

2. append

# DataFrame.append was removed in pandas 2.0; these calls need pandas < 2.0.
result = df1.append(df2)
result = df1.append(df4)
result = df1.append([df2, df3])
result = df1.append(df4, ignore_index=True)
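DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so on current versions the same row-wise stacking is usually written with pd.concat. The following is a minimal sketch of that equivalent; the contents of df1, df2 and df3 are made up for illustration, since the article does not show them.

```python
import pandas as pd

# Hypothetical sample frames; the article does not show what df1..df4 contain.
df1 = pd.DataFrame({"A": ["A0", "A1"], "B": ["B0", "B1"]}, index=[0, 1])
df2 = pd.DataFrame({"A": ["A2", "A3"], "B": ["B2", "B3"]}, index=[2, 3])
df3 = pd.DataFrame({"A": ["A4", "A5"], "B": ["B4", "B5"]}, index=[4, 5])

# Counterpart of df1.append(df2): stack rows and keep the original index labels.
result = pd.concat([df1, df2])

# Counterpart of df1.append([df2, df3]): stack several frames at once.
result_many = pd.concat([df1, df2, df3])

# Counterpart of df1.append(df2, ignore_index=True): renumber the rows from 0.
result_renumbered = pd.concat([df1, df2], ignore_index=True)

print(result_renumbered)
```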

4. join

left.join(right, on=key_or_keys)

result = left.join(right, on='key')
result = left.join(right, on=['key1', 'key2'])
result = left.join(right, on=['key1', 'key2'], how='inner')
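With on=..., join matches the named column of the left frame against the index of the right frame. A small sketch of that behaviour, using made-up left and right frames (their contents are assumptions, not taken from the article):

```python
import pandas as pd

# Hypothetical frames: 'key' is an ordinary column on the left,
# while the right frame carries the same labels in its index.
left = pd.DataFrame({"key": ["K0", "K1", "K2"], "A": ["A0", "A1", "A2"]})
right = pd.DataFrame({"C": ["C0", "C1"]}, index=["K0", "K1"])

# Default how='left': every left row is kept; K2 has no match, so C is NaN.
result = left.join(right, on="key")

# how='inner': only rows whose key also appears in right.index are kept.
inner = left.join(right, on="key", how="inner")

print(result)
print(inner)
```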

5. concat

result = pd.concat([df1, df4], axis=1)
result = pd.concat([df1, df4], axis=1, join='inner')
# join_axes was removed in pandas 1.0; reindex gives the same alignment.
result = pd.concat([df1, df4], axis=1).reindex(df1.index)
result = pd.concat([df1, df4], ignore_index=True)
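To make the difference between the axis=1 variants concrete, here is a hedged sketch with two small frames whose indexes only partly overlap. The frame contents are illustrative assumptions. The join_axes argument was removed in pandas 1.0, so the sketch aligns to df1's index with reindex instead.

```python
import pandas as pd

# Hypothetical frames with partly overlapping row labels.
df1 = pd.DataFrame({"A": ["A0", "A1", "A2"]}, index=[0, 1, 2])
df4 = pd.DataFrame({"D": ["D1", "D2", "D3"]}, index=[1, 2, 3])

# Default join='outer': union of the row labels, gaps filled with NaN.
outer = pd.concat([df1, df4], axis=1)

# join='inner': only the row labels present in both frames.
inner = pd.concat([df1, df4], axis=1, join="inner")

# Keep exactly df1's row labels (replacement for join_axes=[df1.index]).
left_aligned = pd.concat([df1, df4], axis=1).reindex(df1.index)

print(outer)
print(inner)
print(left_aligned)
```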

Text processing:

1. lower() example

import numpy as np
import pandas as pd
s = pd.Series(['Tom', 'William Rick', 'John', 'Alber@t', np.nan, '1234', 'SteveMinsu'])
s.str.lower()

2. upper() example

s = pd.Series(['Tom', 'William Rick', 'John', 'Alber@t', np.nan, '1234','SteveMinsu']) 
s.str.upper()

3. len(): count the characters in each string

s = pd.Series(['Tom', 'William Rick', 'John', 'Alber@t', np.nan, '1234','SteveMinsu']) 
s.str.len()
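The sample Series in the three examples above contains np.nan. The .str methods skip missing entries rather than raising an error, which is worth checking before using the result. A minimal sketch with a shortened version of the same data (the variable names are illustrative):

```python
import numpy as np
import pandas as pd

s = pd.Series(["Tom", "William Rick", np.nan, "1234"])

# The missing entry stays missing; the other strings are lower-cased.
lowered = s.str.lower()

# Lengths are computed per element; the missing entry has no length.
lengths = s.str.len()

# One option: treat a missing string as length 0 and convert to integers.
lengths_filled = lengths.fillna(0).astype(int)

print(lowered)
print(lengths_filled)
```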

4. strip(): remove leading and trailing whitespace

s = pd.Series(['Tom ', ' William Rick', 'John', 'Alber@t']) 
s.str.strip()

5. split(pattern): split each string on the pattern

s = pd.Series(['Tom ', ' William Rick', 'John', 'Alber@t']) 
s.str.split(' ')
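Because two of the sample strings carry a leading or trailing space, splitting on ' ' produces empty strings in the result. The sketch below shows that effect and one way around it (stripping first); expand=True is an optional argument that spreads the pieces across DataFrame columns. The variable names are illustrative.

```python
import pandas as pd

s = pd.Series(["Tom ", " William Rick", "John", "Alber@t"])

# Splitting the raw strings keeps empty pieces where a space led or trailed.
raw = s.str.split(" ")

# Stripping first gives only the actual words.
clean = s.str.strip().str.split(" ")

# expand=True returns a DataFrame with one column per piece.
parts = s.str.strip().str.split(" ", expand=True)

print(raw)
print(clean)
print(parts)
```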

6. cat(sep=pattern): concatenate the strings into one

s = pd.Series(['Tom ', ' William Rick', 'John', 'Alber@t']) 
s.str.cat(sep=' <=> ') 
Running the example above produces the following output:
Tom  <=>  William Rick <=> John <=> Alber@t

7. get_dummies(): splits each string by sep and returns a DataFrame of dummy/indicator variables

s = pd.Series(['Tom ', ' William Rick', 'John', 'Alber@t']) 
s.str.get_dummies()
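The default separator for get_dummies is '|', and none of the sample strings contain it, so in the call above each whole string becomes its own indicator column. A small sketch with made-up tag data shows what the split looks like when the separator is actually present:

```python
import pandas as pd

# Hypothetical data: each entry lists one or more tags separated by '|'.
tags = pd.Series(["a|b", "a", "a|c"])

# One column per distinct tag; 1 marks that the row contains the tag.
dummies = tags.str.get_dummies()

# A different separator can be passed explicitly.
comma_tags = pd.Series(["a,b", "a", "a,c"])
comma_dummies = comma_tags.str.get_dummies(sep=",")

print(dummies)
```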

8. contains(pat): returns True for each string that contains pat; pat is a string or a regular expression

s = pd.Series(['Tom ', ' William Rick', 'John', 'Alber@t']) 
s.str.contains(' ')
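Two details of contains are easy to miss: the pattern is treated as a regular expression by default, and missing values need explicit handling if the result is used as a filter. A hedged sketch, with np.nan added to the sample data for illustration:

```python
import numpy as np
import pandas as pd

s = pd.Series(["Tom ", " William Rick", "John", "Alber@t", np.nan])

# na=False makes the result a plain boolean mask even with missing entries.
mask = s.str.contains(" ", na=False)

# The mask can then be used to select the matching rows.
with_space = s[mask]

# '.' as a regular expression matches any character ...
regex_dot = s.str.contains(".", na=False)

# ... while regex=False searches for a literal dot.
literal_dot = s.str.contains(".", regex=False, na=False)

print(with_space)
```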

9. replace(a, b): replaces each occurrence of a with b.

s = pd.Series(['Tom ', ' William Rick', 'John', 'Alber@t']) 
s.str.replace('@', '$')
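The default value of the regex argument of str.replace changed from True to False in pandas 2.0, so passing it explicitly keeps the behaviour the same across versions. A minimal sketch of both modes on the sample data (the variable names are illustrative):

```python
import pandas as pd

s = pd.Series(["Tom ", " William Rick", "John", "Alber@t"])

# Literal replacement: every '@' becomes '$'.
literal = s.str.replace("@", "$", regex=False)

# Regular-expression replacement: remove every run of whitespace.
no_spaces = s.str.replace(r"\s+", "", regex=True)

print(literal)
print(no_spaces)
```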

10. repeat(value): repeats each element the specified number of times

s = pd.Series(['Tom ', ' William Rick', 'John', 'Alber@t']) 
s.str.repeat(2)

Running the example above produces the following output:

0 Tom Tom
1 William Rick William Rick
2 JohnJohn
3 Alber@tAlber@t

11. count(pattern): counts how many times the pattern occurs in each string

s = pd.Series(['Tom ', ' William Rick', 'John', 'Alber@t']) 
print("The number of 'm's in each string:")
print(s.str.count('m'))

Running the example above produces the following output:

The number of 'm's in each string:

0 1
1 1
2 0
3 0

12. startswith(pattern): returns True for each string that starts with the pattern

s = pd.Series(['Tom ', ' William Rick', 'John', 'Alber@t']) 
print("Strings that start with 'T':")
print(s.str.startswith('T'))

Running the example above produces the following output:

Strings that start with 'T':

0 True
1 False
2 False
3 False
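The boolean Series returned by startswith is mostly useful as a row filter. The sketch below shows that use and also why stripping matters here: ' William Rick' begins with a space, so it does not start with 'W' until the whitespace is removed. The variable names are illustrative.

```python
import pandas as pd

s = pd.Series(["Tom ", " William Rick", "John", "Alber@t"])

# Use the boolean result to keep only the matching rows.
starts_with_t = s[s.str.startswith("T")]

# The leading space means no raw string starts with 'W' ...
raw_w = s.str.startswith("W")

# ... but after strip() the second entry does.
stripped_w = s.str.strip().str.startswith("W")

print(starts_with_t)
print(stripped_w)
```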

(Editor: 湖南网)
