data-analysis
Here are 5,338 public repositories matching this topic...
Code Sample, a copy-pastable example if possible
```python
import pandas as pd
import numpy as np

df = pd.DataFrame({'A': [1, 2, 3, 4, 5], 'B': [7, 5, np.nan, 3, 2]})
df.plot(x='A', y='B')
df = df.astype('Int64')
df.plot(x='A', y='B')
```
Problem description
The first plotting command works; the second throws:
TypeError: float() argument must be a string or a number, not 'NAType'
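A likely workaround, assuming the failure comes from `pd.NA` not being convertible by `float()`: cast the nullable-integer columns back to a NumPy float dtype before plotting. This is a sketch of a user-side workaround, not the fix pandas itself may ship:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4, 5], 'B': [7, 5, np.nan, 3, 2]})
df = df.astype('Int64')            # nullable integer dtype; the missing value becomes pd.NA

# pd.NA cannot be coerced by float(), which is what breaks plotting.
# Casting back to float64 turns pd.NA into np.nan, which plots fine:
plottable = df.astype('float64')
assert plottable['B'].isna().sum() == 1
# plottable.plot(x='A', y='B')     # works: np.nan is a valid float
```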
When adding a new data store database in Metabase it tries to connect first with SSL (if the driver supports SSL), and then without, in that order. If either connection succeeds, the database details are accepted as valid. Yes, you can add driver specific JDBC options to use SSL, but there is no good feed
Hi there
I'm trying to parse this kind of line from a Python Flask service whose log format is %(asctime)s [%(process)d] (%(levelname)s) (%(name)s): %(message)s
2020-02-10 13:58:38,594 [31383] (INFO) (flask.app): request: OPTIONS https://server_hostname/0.1/token/a_big_uuid {'Host': 'server_hostname', 'X-Script-Name': '/api/auth', 'X-Forwarded-For': 'an_IP_address', 'Connection': 'c
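One way to parse that format with the standard library: a regex with named groups, one per `%(...)s` field. The pattern below is a sketch built only from the format string quoted above:

```python
import re

# Mirrors "%(asctime)s [%(process)d] (%(levelname)s) (%(name)s): %(message)s"
LOG_RE = re.compile(
    r'(?P<asctime>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) '
    r'\[(?P<process>\d+)\] '
    r'\((?P<levelname>\w+)\) '
    r'\((?P<name>[\w.]+)\): '
    r'(?P<message>.*)'
)

line = ("2020-02-10 13:58:38,594 [31383] (INFO) (flask.app): "
        "request: OPTIONS https://server_hostname/0.1/token/a_big_uuid")
m = LOG_RE.match(line)
assert m.group('levelname') == 'INFO'
assert m.group('name') == 'flask.app'
```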
Is your feature request related to a problem? Please describe.
When working with a big piece of text, I sometimes scroll down and copy some text into another tab. When I switch back to the first tab, both the input and the output pane are scrolled back to the top, so I lose track of where I was working.
Describe the solution you'd like
After switching tabs, the scroll position should be remembered.
Link to doc page in question (if any):
https://docs.streamlit.io/cli.html#view-all-config-options
Name of the Streamlit feature whose docs need improvement:
The documentation for configuring Streamlit through environment variables incorrectly states that the prefix is STREAMLIT_CONFIG_. Mentioned in streamlit/streamlit#477 (comment).
Describe the bug
I'm trying to export with "Custom tabular exporter…", choosing "Upload to -> A new Google spreadsheet". In some rows, if a value is missing, the cell disappears and the cells to its right shift left.
To Reproduce
Steps to reproduce the behavior:
- Open "Custom tabular exporter…"
- Choose "Upload to -> A new Google spreadsheet"
- See the result
Updated
Mar 4, 2020 - Python
Need to do some better handling of low-observation models in plot_diagnostics. These are models that shouldn't really be estimated, and we can't make the plots work for them, but we shouldn't raise exceptions either.
- Any dataset with fewer than 10 observations will raise an error when computing the error autocorrelations:
mod = sm.tsa.statespace.SARIMAX(np.random.normal(size=10), order=(10, -
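One way to degrade gracefully is to skip the autocorrelation panel when there are too few residuals. A minimal sketch of the guard logic (`safe_acf` is a hypothetical helper, not statsmodels API):

```python
import numpy as np

def safe_acf(resid, nlags=10):
    """Return lag-1..nlags autocorrelations, or None with too few observations.

    Hypothetical helper illustrating the guard; plot_diagnostics could omit
    the correlogram panel instead of raising when this returns None.
    """
    resid = np.asarray(resid, dtype=float)
    if resid.size <= nlags:
        return None                     # not enough observations for the panel
    resid = resid - resid.mean()
    denom = np.dot(resid, resid)
    return np.array([np.dot(resid[:-k], resid[k:]) / denom
                     for k in range(1, nlags + 1)])

assert safe_acf(np.zeros(5), nlags=10) is None      # low-observation case: skip
```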
Missing functionality
On the back of the issue raised in pandas-profiling/pandas-profiling#315, I would like to request improved documentation of removed features and the alternatives to them in the new version of pandas-profiling, starting with v2.4.0.
Proposed feature
- docs on why `style={'full_width': True}` and `minify_html=True` were removed
Page
https://docs.alluxio.io/os/user/edge/en/deploy/Running-Alluxio-On-Kubernetes.html
Summary
The K8s deployment instructions are missing documentation for:
- Configuring tiered storage w/ helm chart
- Deploying the fuse daemonset using the helm chart
The ask for this issue is to
- create an "Example: Tiered Storage" under the section https://docs.alluxio.io/os/user/edge/en/de
Tooltips help to explain concepts better. We can implement one of the mkdocs plugins that add this functionality:
https://github.com/midnightprioriem/mkdocs-tooltipster-links-plugin
https://github.com/lsaether/md-tooltips
Description
Calling the fit method of a Pipeline object throws an exception, UnboundLocalError: local variable 'cloned_transformer' referenced before assignment, when the memory argument is set. Therefore I am unable to cache any transformers (especially during hyperparameter tuning with a Pipeline object).
Steps/Code to Reproduce
Example:
from imblearn.pipel-
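For comparison, this is what the memory argument is meant to enable, shown with scikit-learn's own Pipeline (a sketch; the bug above is specific to imblearn's subclass, and the dataset here is synthetic):

```python
import tempfile

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=50, random_state=0)

# memory= caches fitted transformers on disk, so repeated fits during
# hyperparameter search can reuse them instead of refitting.
pipe = Pipeline([('scale', StandardScaler()),
                 ('clf', LogisticRegression())],
                memory=tempfile.mkdtemp())
pipe.fit(X, y)
assert pipe.predict(X).shape == (50,)
```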
I've been trying out knowledge-repo and came across multiple problems that seem to have resulted from following the old documentation on knowledge-repo.readthedocs.io. Looking through the issues, several PRs modified the docs, but the changes haven't propagated yet to the documentation website. It might be a good idea to synchronize the two, so that new users can have a lower entrance
How can I pass a callback parameter to the fit method of the AutoEncoder model? There is no such parameter.

```python
from keras.callbacks.callbacks import EarlyStopping

cb_earlystop = EarlyStopping(monitor='val_loss', min_delta=0, patience=0, verbose=0,
                             mode='auto', baseline=None, restore_best_weights=False)
pyod_model.fit(scaler, callbacks=[cb_earlystop])
```

TypeError: fi
Following #68, it would be very nice to have extra information in the model. Maybe I missed it, but I don't see these in the documentation:
- Recovering the early stopping epoch number
- Recovering the cross validation test and train losses / metrics
- Recovering eval_set losses / metrics
- Being able to pass a path to a libffm data format
The space between two plots in a facet plot is not large enough for the tick values, so the values overlap with the left adjacent figure (please see screenshot).
I use the command:
p9.facet_wrap(facets='currency', nrow=2, scales='free_y', shrink=True)
with plotnine 0.6 w
Use mvn release:prepare to build the docs and copy them to the /docs directory
I suggest keeping the README short and to the point: badges showing status etc. of the package, what is the purpose of the package, how to install, 1-2 basic code examples, and contributing information.
More examples, function API docs, and detailed descriptions can be moved to the docs site.
I also suggest adding a gallery to the docs, [similar to Seaborn](https://seaborn.pydata.org/example
MSSQL Storage
First off, thanks so much for tad; all in all I'm loving it as a lightweight CSV viewer!
The only issue I've had so far is that someone passed me a semicolon-delimited file with some decimals, and strangely, these numbers load as ints unless I replace the semicolons with commas. Any thoughts on what might be going on here?
Thanks again!
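A guess at the cause: semicolon-delimited files usually come from locales that also write decimals with a comma, so "1,5" is one decimal value, not an int plus a stray field. pandas illustrates the distinction (the sample data is made up; tad's parser may behave differently):

```python
import io

import pandas as pd

# Semicolon-delimited with decimal commas, as commonly exported in
# European locales:
raw = "value;label\n1,5;a\n2,25;b\n"

df = pd.read_csv(io.StringIO(raw), sep=';', decimal=',')
assert list(df['value']) == [1.5, 2.25]   # parsed as floats, not ints
```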
In the IterativeImputer, min_value and max_value default to None. Internally, if they are None, the min and max values are set to -np.inf and np.inf, respectively. We should change this behaviour and make the defaults min_value=-np.inf and max_value=np.inf directly.