By default, df.describe() does NOT show string columns. When a DataFrame has both numeric and string columns, it summarizes only the numeric ones. String (object) columns are not dropped; they are just not displayed.
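As a quick illustration of that default, here is a minimal sketch on a made-up two-column frame (the column names `additions` and `author` are hypothetical, not taken from your dataset):

```python
import pandas as pd

# Hypothetical toy frame: one numeric column and one string column.
df = pd.DataFrame({"additions": [10, 3, 7], "author": ["ann", "bob", "ann"]})

default_summary = df.describe()            # numeric columns only
full_summary = df.describe(include="all")  # numeric and string columns

print(list(default_summary.columns))
print(list(full_summary.columns))
print(list(df.columns))  # the frame itself still holds both columns
```

The string column is missing only from the default summary; the DataFrame is unchanged.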
You can verify this with:

    df.columns
    df.dtypes
    df.describe(include="all")

Hugging Face Dataset objects can have a format attached (e.g., torch, numpy). When a format is set, only the formatted columns are returned unless you explicitly request otherwise.
To get the same columns from the Hugging Face dataset and the pandas DataFrame, reset the format before converting:

    from datasets import load_dataset

    pr_commits = load_dataset("hao-li/AIDev", "pr_commits")["train"]
    pr_commits.reset_format()  # IMPORTANT: clear any attached format first
    commits_df = pr_commits.to_pandas()

    print(commits_df.columns)
    print(commits_df.dtypes)

Hope it helps!
