Using the Python pandas.DataFrame.astype method - CJavaPy

DataFrame.astype(dtype, copy=True, errors='raise', **kwargs)

Cast a pandas object to a specified dtype.

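Parameters:

dtype : data type, or dict of column name -> data type

Use a numpy.dtype or Python type to cast the entire pandas object to the same type.

Alternatively, use {col: dtype, ...}, where col is a column label and dtype is a numpy.dtype or Python type, to cast one or more of the DataFrame's columns to column-specific types.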

copy : bool, default True

Return a copy when copy=True. Be very careful when setting copy=False: changes to the values may then propagate to other pandas objects.

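errors : {'raise', 'ignore'}, default 'raise'

Controls whether an exception is raised when the data is invalid for the provided dtype.

raise : allow exceptions to be raised

ignore : suppress exceptions. On error, return the original object

New in version 0.20.0.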

kwargs : keyword arguments to pass on to the constructor

Returns:

casted : same type as caller
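As a minimal sketch of the default errors='raise' behavior described above, the snippet below tries to cast a column that contains a non-numeric string to int and catches the resulting exception. The data and variable names are illustrative assumptions, not part of the original examples.

```python
import pandas as pd

# Hypothetical data: 'three' cannot be parsed as an integer
df_bad = pd.DataFrame({'A': ['1', '2', 'three']})

try:
    # errors='raise' is the default, so the failed cast raises
    df_bad.astype({'A': int})
    raised = False
except ValueError as exc:
    raised = True
    print("Conversion failed:", exc)
```

Catching the exception this way makes a bad value visible immediately, instead of silently leaving the column unconverted.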

Examples:

1) Convert the data type of an entire DataFrame

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    'A': [1, 2, 3],
    'B': [4.0, 5.1, 6.2],
    'C': ['7', '8', '9']
})

print("Original DataFrame:")
print(df)
print(df.dtypes)

# Convert all columns to strings
df_str = df.astype(str)

print("\nDataFrame after converting to strings:")
print(df_str)
print(df_str.dtypes)

2) Convert the data type of a specific column

# Convert column 'C' to an integer type
df['C'] = df['C'].astype(int)

print("\nDataFrame after converting column 'C' to integers:")
print(df)
print(df.dtypes)

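3) Convert several columns at once with a dict

# Use a dict to convert several columns in a single call
# (casting the float column 'B' to int64 drops the fractional part)
df = df.astype({'A': 'float64', 'B': 'int64'})

print("\nDataFrame after converting several columns:")
print(df)
print(df.dtypes)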
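Casting a float column to an integer type, as done for column 'B' above, discards the fractional part rather than rounding. The following sketch, using made-up values, contrasts a direct cast with rounding first:

```python
import pandas as pd

# Illustrative values (not from the example above)
s = pd.Series([4.0, 5.6, 6.7])

# Direct cast: the fractional part is dropped
truncated = s.astype('int64')

# Round to the nearest integer first, then cast
rounded = s.round().astype('int64')

print(truncated.tolist())
print(rounded.tolist())
```

Which of the two is appropriate depends on the data; the point is only that astype does not round on its own.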

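4) Handle conversion errors

# Sample DataFrame containing a value that cannot be converted to a number
df_with_errors = pd.DataFrame({
    'A': ['1', '2', 'three'],
    'B': ['4.0', '5.1', '6.2']
})

print("\nDataFrame containing an invalid value:")
print(df_with_errors)

# Try to convert column 'A' to integers, ignoring errors.
# Because 'three' cannot be converted, the original column is returned unchanged.
df_with_errors['A'] = df_with_errors['A'].astype(int, errors='ignore')

print("\nDataFrame after ignoring the error:")
print(df_with_errors)
print(df_with_errors.dtypes)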
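With errors='ignore' the whole column is left unconverted if any single value fails. If the goal is to convert the valid values and mark the invalid ones as missing, one possible alternative (a different function, not part of astype) is pd.to_numeric with errors='coerce'. A minimal sketch with illustrative data:

```python
import pandas as pd

# Illustrative data: 'three' is not a valid number
s = pd.Series(['1', '2', 'three'])

# errors='coerce' turns unparseable entries into NaN
numeric = pd.to_numeric(s, errors='coerce')

# Optionally cast to the nullable integer dtype 'Int64',
# which represents the missing entry as <NA>
nullable_int = numeric.astype('Int64')

print(numeric)
print(nullable_int)
```

Note the capital "I" in 'Int64': this is the pandas nullable integer type, which can hold missing values, unlike NumPy's 'int64'.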

5) Series usage examples

import pandas as pd

# Create a sample Series with dtype int32
ser = pd.Series([1, 2], dtype='int32')
print("Original Series:")
print(ser)

# Convert to int64
ser_int64 = ser.astype('int64')
print("\nSeries after converting to int64:")
print(ser_int64)

# Convert to a categorical type
ser_category = ser.astype('category')
print("\nSeries after converting to a categorical type:")
print(ser_category)

# Convert to an ordered categorical type with a custom category order
cat_dtype = pd.api.types.CategoricalDtype(categories=[2, 1], ordered=True)
ser_ordered_category = ser.astype(cat_dtype)
print("\nSeries after converting to an ordered categorical type with a custom order:")
print(ser_ordered_category)

# Note the behavior of copy=False
s1 = pd.Series([1, 2])
s2 = s1.astype('int64', copy=False)
s2[0] = 10

print("\nValue of s1 after modifying s2:")
print(s1)  # s1[0] changes as well, unless pandas Copy-on-Write is enabled
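For contrast with the copy=False case, here is a small sketch of the default behavior, where astype returns a copy and later changes to the result do not affect the original. The variable names are illustrative.

```python
import pandas as pd

s1 = pd.Series([1, 2])

# No copy argument: the result is independent of s1
s2 = s1.astype('int64')
s2[0] = 10

print(s1.tolist())  # s1 keeps its original values
print(s2.tolist())
```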
