# Parsing a very large array with list comprehension is slow

## Issue

I have a very large list of numerical values in `numpy.float64` format, and I want to convert each `inf` value to `0.0` and parse the remaining elements to plain `float`.

This is my code, which works perfectly:

```python
import numpy as np

# Values in numpy.float64 format.
original_values = [np.float64("Inf"), np.float64(0.02345), np.float64(0.2334)]

# Convert them
parsed_values = [0.0 if x == float("inf") else float(x) for x in original_values]
```

But this is slow. Is there any way to make this code faster, perhaps with some `map` or `numpy` magic (I have no experience with those libraries)?

## Solution

Hey~ since you're asking how to do this faster with numpy, the quick answer is to turn the list into a numpy array and do it the numpy way:

```python
import numpy as np

original_values = [np.float64("Inf"), ..., np.float64(0.2334)]
arr = np.array(original_values)
arr[arr == np.inf] = 0
```

where `arr == np.inf` returns a boolean array that looks like `array([ True, ..., False])` and can be used as a mask to select the matching indices in `arr`, as shown above.
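Here is a minimal, self-contained illustration of that boolean-mask assignment on concrete values (the variable names are just for this sketch; note that `np.isinf` is a possible variant that would also catch `-inf`):

```python
import numpy as np

arr = np.array([np.inf, 0.02345, 0.2334])

# Boolean mask: True wherever the value is +inf
mask = arr == np.inf
print(mask)   # [ True False False]

# Assign 0.0 at every True position, in place
arr[mask] = 0.0
print(arr)    # [0.      0.02345 0.2334 ]

# np.isinf(arr) would also match -inf, if that can occur in your data
```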

Hope it helps.

I tested a bit, and it should be fast enough:

```python
import timeit

import numpy as np

# Create a huge array (one billion float64 values, about 8 GB of RAM)
arr = np.random.random(1000000000)
idx = np.random.randint(0, high=1000000000, size=1000000)
arr[idx] = np.inf

# Time the replacement
def replace_inf_with_0(arr=arr):
    arr[arr == np.inf] = 0

timeit.Timer(replace_inf_with_0).timeit(number=1)
```

The output shows it takes about 1.5 seconds to turn all 1,000,000 `inf`s into `0`s in a 1,000,000,000-element array.

@Avión used `arr.tolist()` at the end to convert the array back to a list for MongoDB, which is the common way. I tried it with the billion-element array: the conversion took about 30 seconds, while creating the array took less than 10 seconds. So feel free to recommend more efficient methods.
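For reference, the full round trip back to native Python floats can be sketched like this on a small input (illustrative values only; `tolist()` converts each `numpy.float64` element to a plain `float`):

```python
import numpy as np

original_values = [np.float64("Inf"), np.float64(0.02345), np.float64(0.2334)]

arr = np.array(original_values)
arr[arr == np.inf] = 0.0

# Convert back to a plain Python list of floats
parsed_values = arr.tolist()
print(type(parsed_values[0]))  # <class 'float'>
print(parsed_values)           # [0.0, 0.02345, 0.2334]
```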