Floating Point issues in percentiles #2

For numeric variables with only a few distinct values (say 2-3), many percentile points share the same value. Because of floating-point error, percentiles that should be identical can differ very slightly, so the condition in `np.where(arr > arr[start])` may match the wrong index and return the wrong lowest percentile, leaving the program stuck in the while loop.

```
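import numpy as np

def get_next_range(arr, group_range, start):
    # arr holds the percentile values; start and group_range are percentile indices.
    if group_range + start >= 100:
        return 100
    elif (100 - group_range / 2) < start + group_range:
        # Less than half a group would remain after this one, so end at 100.
        return 100
    elif arr[-1] == arr[start]:
        # The last percentile equals the current one, so no larger value is left.
        return 100
    elif (arr[start + group_range] == arr[start]) or (arr[start] < 0):
        # Jump to the first index whose value exceeds arr[start],
        # but not before the first non-negative value.
        return np.max([np.min(np.where(arr > arr[start])), np.min(np.where(arr >= 0))])
    else:
        return group_range + start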
```
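
A minimal sketch of how the strict comparison can go wrong, using a hand-built array rather than real `np.percentile` output: `0.1 + 0.2` and `0.3` look the same when rounded for display but are not equal as floats. The array contents and the helper name `first_greater_index` are assumptions made for illustration only.

```python
import numpy as np

# Hand-built stand-in for a percentile array: the first four entries are
# meant to be the same value, but two of them carry floating-point error.
arr = np.array([0.3, 0.1 + 0.2, 0.3, 0.1 + 0.2, 1.0, 1.0])

def first_greater_index(arr, start):
    # Hypothetical helper mirroring np.min(np.where(arr > arr[start])).
    return int(np.min(np.where(arr > arr[start])))

# 0.1 + 0.2 is slightly larger than 0.3, so the comparison stops at index 1
# instead of index 4, where the value actually changes.
noisy_index = first_greater_index(arr, 0)
print(noisy_index)
```

Here the lookup lands inside the run of values that were supposed to be equal, which is the kind of wrong result described above.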

To fix this, round the percentile values to a fixed number of decimal places after computing them. Something like the following should resolve the issue:
`percentiles = np.around(np.array([np.percentile(df1[var],p) for p in range(0,100)]), decimals = 5)`
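
As a slightly fuller sketch of the same idea, the percentiles could be computed in one vectorized `np.percentile` call and rounded before any comparison is made. The function name `rounded_percentiles` and the sample data are assumptions, not part of the original code.

```python
import numpy as np

def rounded_percentiles(values, decimals=5):
    # Percentiles 0..99, matching range(0, 100) in the one-liner above,
    # rounded so that near-equal floats collapse to the same value.
    return np.around(np.percentile(values, np.arange(100)), decimals=decimals)

# Illustrative column with only two distinct values.
values = np.array([0.1] * 50 + [0.7] * 50)
percentiles = rounded_percentiles(values)

# Rounding also removes the tiny difference between 0.1 + 0.2 and 0.3.
same_after_rounding = np.around(0.1 + 0.2, 5) == np.around(0.3, 5)
print(percentiles.shape, same_after_rounding)
```

The choice of `decimals` is a trade-off: it should be small enough to absorb floating-point noise but large enough not to merge values that are genuinely different in the data.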

