{"id":80,"date":"2025-06-15T10:13:00","date_gmt":"2025-06-15T10:13:00","guid":{"rendered":"https:\/\/vicfolio.com\/blog\/?p=80"},"modified":"2025-06-11T16:18:07","modified_gmt":"2025-06-11T16:18:07","slug":"analisis-de-datos-con-pandas-guia-practica-para-e-commerce","status":"publish","type":"post","link":"https:\/\/vicfolio.com\/blog\/?p=80","title":{"rendered":"Data Analysis with Pandas: A Practical Guide for E-commerce"},"content":{"rendered":"\n<h2 class=\"wp-block-heading\"><strong>Introduction to Analysis with Pandas<\/strong><\/h2>\n\n\n\n<p>In this hands-on tutorial, you will learn to process and analyze e-commerce sales data using Pandas. We will work with a real dataset that contains:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Order information<\/li>\n\n\n\n<li>Product data<\/li>\n\n\n\n<li>Customer records<\/li>\n\n\n\n<li>Geographic locations<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>1. Initial Setup<\/strong><\/h2>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Installing Libraries<\/strong><\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code>pip install pandas numpy matplotlib seaborn plotly scipy openpyxl<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Importing Modules<\/strong><\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code>import pandas as pd\nimport numpy as np\nimport matplotlib.pyplot as plt\nimport seaborn as sns<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>2. 
Initial Data Exploration<\/strong><\/h2>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Loading the Dataset<\/strong><\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code>df = pd.read_csv('ventas_ecommerce.csv')<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>First Inspection<\/strong><\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code># Overview (info() prints its report directly and returns None)\ndf.info()\n\n# Descriptive statistics\nprint(df.describe())\n\n# Unique values per column\nprint(df.nunique())<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>3. Professional Data Cleaning<\/strong><\/h2>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Handling Missing Values<\/strong><\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code># Percentage of null values per column\nprint(df.isnull().mean() * 100)\n\n# Cleaning strategies: impute with the median, or drop rows missing a required field\ndf&#91;'columna_critica'] = df&#91;'columna_critica'].fillna(df&#91;'columna_critica'].median())\ndf.dropna(subset=&#91;'columna_importante'], inplace=True)<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Removing Duplicates<\/strong><\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code>df.drop_duplicates(inplace=True)<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Fixing Formats<\/strong><\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code># Type conversion\ndf&#91;'fecha'] = pd.to_datetime(df&#91;'fecha'])\ndf&#91;'precio'] = df&#91;'precio'].astype(float)\n\n# Text normalization\ndf&#91;'categoria'] = df&#91;'categoria'].str.upper().str.strip()<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>4. 
Advanced Analysis<\/strong><\/h2>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Feature Engineering<\/strong><\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code># Creating new features\ndf&#91;'valor_total'] = df&#91;'cantidad'] * df&#91;'precio']\ndf&#91;'mes_venta'] = df&#91;'fecha'].dt.month<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Aggregations with GroupBy<\/strong><\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code>ventas_por_categoria = df.groupby('categoria').agg({\n    'valor_total': &#91;'sum', 'mean', 'count'],\n    'cliente_id': pd.Series.nunique\n}).rename(columns={'cliente_id': 'clientes_unicos'})<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Time-Based Analysis<\/strong><\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code># 'M' is the month-end alias; pandas 2.2+ deprecates it in favor of 'ME'\nventas_mensuales = df.resample('M', on='fecha')&#91;'valor_total'].sum()\nventas_mensuales.plot(title='Monthly Sales')<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>5. Data Visualization<\/strong><\/h2>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Basic Charts with Pandas<\/strong><\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code># value_counts() counts rows, so this plots the number of sales, not revenue\ndf&#91;'categoria'].value_counts().plot(kind='bar', title='Number of Sales by Category')<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Advanced Visualization with Seaborn<\/strong><\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code>sns.boxplot(x='categoria', y='valor_total', data=df)\nplt.title('Distribution of Sales by Category')\nplt.xticks(rotation=45)<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Interactive Dashboard with Plotly<\/strong><\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code>import plotly.express as px\n\nfig = px.sunburst(df, path=&#91;'region', 'categoria'], values='valor_total')\nfig.show()<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>6. 
Performance Optimization<\/strong><\/h2>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Vectorization Techniques<\/strong><\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code># Anti-pattern (slow)\nfor i, row in df.iterrows():\n    df.at&#91;i, 'nueva_col'] = row&#91;'precio'] * 1.1\n\n# Best practice (fast)\ndf&#91;'nueva_col'] = df&#91;'precio'] * 1.1<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Memory Reduction<\/strong><\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code># Conversion to optimal types\ndf&#91;'id_producto'] = df&#91;'id_producto'].astype('category')\ndf&#91;'precio'] = pd.to_numeric(df&#91;'precio'], downcast='float')<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>7. Real Business Cases<\/strong><\/h2>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Cohort Analysis<\/strong><\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code># Cohort = month of each customer's first purchase\ndf&#91;'cohorte'] = df.groupby('cliente_id')&#91;'fecha'].transform('min').dt.to_period('M')\n# mes_venta is the calendar month (1-12); for data spanning several years, group by a year-month period instead\ncohortes = df.groupby(&#91;'cohorte', 'mes_venta']).agg({'cliente_id': pd.Series.nunique})<\/code><\/pre>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>A\/B Testing<\/strong><\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code>grupo_control = df&#91;df&#91;'grupo'] == 'A']&#91;'conversion']\ngrupo_prueba = df&#91;df&#91;'grupo'] == 'B']&#91;'conversion']\n\n# Requires SciPy (pip install scipy)\nfrom scipy import stats\nt_test = stats.ttest_ind(grupo_control, grupo_prueba)\nprint(f\"t-test p-value: {t_test.pvalue:.4f}\")<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>8. 
Exporting Results<\/strong><\/h2>\n\n\n\n<h3 class=\"wp-block-heading\"><strong>Saving Data<\/strong><\/h3>\n\n\n\n<pre class=\"wp-block-code\"><code># CSV format\ndf.to_csv('datos_limpios.csv', index=False)\n\n# Excel format (writing .xlsx requires openpyxl)\nventas_por_categoria.to_excel('analisis_ventas.xlsx')\n\n# Pickle format (high performance)\ndf.to_pickle('datos_optimizados.pkl')<\/code><\/pre>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>9. Best Practices<\/strong><\/h2>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Documentation<\/strong>: Comment every important transformation<\/li>\n\n\n\n<li><strong>Modularization<\/strong>: Create reusable functions<\/li>\n\n\n\n<li><strong>Validation<\/strong>: Implement data quality checks<\/li>\n\n\n\n<li><strong>Versioning<\/strong>: Save clean versions of the dataset<\/li>\n<\/ol>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Final Checklist<\/strong><\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li>[ ] Initial exploration complete<\/li>\n\n\n\n<li>[ ] Data cleaned and normalized<\/li>\n\n\n\n<li>[ ] Relevant features created<\/li>\n\n\n\n<li>[ ] Key analyses performed<\/li>\n\n\n\n<li>[ ] Effective visualizations<\/li>\n\n\n\n<li>[ ] Results exported<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Additional Resources<\/strong><\/h2>\n\n\n\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/pandas.pydata.org\/docs\/\">Official Pandas documentation<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/pandas.pydata.org\/Pandas_Cheat_Sheet.pdf\">Pandas Cheat Sheet<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/github.com\/pandas-dev\/pandas\/tree\/master\/doc\/cheatsheet\">Advanced examples<\/a><\/li>\n<\/ul>\n\n\n\n<p>You are now ready to analyze e-commerce data like a professional with Pandas! 
&#x1f680;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Introduction to Analysis with Pandas In this hands-on tutorial, you will learn to process and analyze e-commerce sales data using Pandas. We will work with a real dataset that contains: 1. Initial Setup Installing Libraries Importing Modules 2. Initial Data Exploration Loading the Dataset First Inspection 3. Professional Data Cleaning Handling Missing Values [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":81,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[11],"tags":[20,5],"class_list":["post-80","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-programacion","tag-programacion","tag-python"],"_links":{"self":[{"href":"https:\/\/vicfolio.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/80","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/vicfolio.com\/blog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/vicfolio.com\/blog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/vicfolio.com\/blog\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/vicfolio.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=80"}],"version-history":[{"count":1,"href":"https:\/\/vicfolio.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/80\/revisions"}],"predecessor-version":[{"id":82,"href":"https:\/\/vicfolio.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/80\/revisions\/82"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/vicfolio.com\/blog\/index.php?rest_route=\/wp\/v2\/media\/81"}],"wp:attachment":[{"href":"https:\/\/vicfolio.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=80"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/vicfolio.
com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=80"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/vicfolio.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=80"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}