Plotly: How can I set marker size based on column value?

Issue

I am trying to use plotly (version 4.6.0) to create plots, but I am having trouble with the marker size attribute.

I am using the Boston housing price dataset in my example. I want to use the value in one of the columns of my dataframe to set a variable size for the marker, but I get an error when I use a direct reference to the column (size='TAX'). I can set the size to a constant (size=1) without issues.

I found some examples online, but they generate a ValueError when I try to use them. How can I avoid this error? Code and error are shown below.

    import chart_studio.plotly as py
    import plotly.graph_objs as go
    from plotly.offline import iplot, init_notebook_mode
    import cufflinks
    cufflinks.go_offline(connected=True)
    init_notebook_mode(connected=True)
    import pandas as pd
    from sklearn.datasets import load_boston
    
    boston = load_boston()
    df = pd.DataFrame(boston.data, columns=boston.feature_names)
    y = boston.target
    df['RAD_CAT']=df['RAD'].astype(str)
    
    df.iplot(
        x='CRIM',
        y='INDUS',
        size='TAX',
        #size=1,
        text='RAD',
        mode='markers',
        layout=dict(
            xaxis=dict(type='log', title='CRIM'),
            yaxis=dict(title='INDUS'),
            title='CRIM vs INDUS Sized by RAD'))
    
    ValueError:  
        Invalid value of type 'builtins.str' received for the 'size' property of scatter.marker  
            Received value: 'TAX'  
    
        The 'size' property is a number and may be specified as:  
          - An int or float in the interval [0, inf]  
          - A tuple, list, or one-dimensional numpy array of the above  

Solution

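Pass the column's values rather than its name. As the error message says, the size property must be a number or a list/array of numbers, so size='TAX' is rejected while size=df['TAX'] is accepted. Dividing by 20 scales the markers down, since the TAX values are in the hundreds.

    import chart_studio.plotly as py
    import plotly.graph_objs as go
    from plotly.offline import iplot, init_notebook_mode
    import cufflinks

    cufflinks.go_offline(connected=True)
    init_notebook_mode(connected=True)
    import pandas as pd
    from sklearn.datasets import load_boston

    boston = load_boston()
    df = pd.DataFrame(boston.data, columns=boston.feature_names)

    df.iplot(
        x='CRIM',
        y='INDUS',
        # Pass the numeric values, not the column name; /20 shrinks the markers.
        size=df['TAX']/20,
        text='RAD',
        mode='markers',
        layout=dict(
            xaxis=dict(type='log', title='CRIM'),
            yaxis=dict(title='INDUS'),
            title='CRIM vs INDUS Sized by TAX'))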
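
The divisor of 20 above is tuned by hand to this one column. As a rough alternative, the sketch below rescales any numeric column linearly into a fixed marker-size range. The helper name scale_marker_sizes and the 5 to 40 default range are assumptions made for illustration, not part of plotly or cufflinks, and the sample numbers are made up rather than taken from the dataset.

```python
def scale_marker_sizes(values, min_size=5.0, max_size=40.0):
    """Linearly rescale numeric values into [min_size, max_size].

    Hypothetical helper: not part of plotly or cufflinks.
    """
    values = [float(v) for v in values]
    if not values:
        return []
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are equal: use the midpoint size for every marker.
        return [(min_size + max_size) / 2.0] * len(values)
    span = hi - lo
    return [min_size + (v - lo) / span * (max_size - min_size) for v in values]


# Made-up values standing in for a column such as df['TAX'].
tax = [187.0, 296.0, 711.0]
sizes = scale_marker_sizes(tax)
```

Because the size property accepts a list of non-negative numbers, a list built this way (for example from scale_marker_sizes(df['TAX'])) should be usable in place of df['TAX']/20, though that has not been verified against every cufflinks version.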

Answered By – Flavia Giammarino

This answer, collected from Stack Overflow, is licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.
