Google Ads search_stream to JSON
I want to convert the GoogleAdsRows to JSON (and then load them into a DataFrame). When I use proto.Message.to_json, I do get JSON, but it also returns fields that I never queried.
Here's the code I'm using (I declare the credentials right before this, but left that part out so I can post it):
    import proto
    from google.ads.googleads.client import GoogleAdsClient

    credentials = {
        "developer_token": developer_token,
        "refresh_token": refresh_token,
        "client_id": client_id,
        "client_secret": client_secret,
        "login_customer_id": login_customer_id}

    client = GoogleAdsClient.load_from_dict(credentials)
    ga_service = client.get_service("GoogleAdsService", version='v6')

    query = """
        SELECT
            campaign.id,
            segments.device
        FROM campaign
        WHERE segments.date = '20210405'
        LIMIT 10
    """

    response = ga_service.search_stream(customer_id=customer_id, query=query)
    for batch in response:
        for row in batch.results:
            newrow = proto.Message.to_json(row, preserving_proto_field_name=True)
            print(newrow)
This returns (partial output shown):

    "click_type": 0,
    "conversion_action_category": 0,
    "conversion_attribution_event_type": 0,
    "conversion_lag_bucket": 0,
    "conversion_or_adjustment_lag_bucket": 0,
    "day_of_week": 0,
    "external_conversion_source": 0,
    "hotel_check_in_day_of_week": 0,
    "hotel_date_selection_type": 0,
    "hotel_rate_type": 0,
    "hotel_price_bucket": 0,
    "month_of_year": 0,
    "placeholder_type": 0,
    "product_channel": 0,
    "product_channel_exclusivity": 0,
    "product_condition": 0,
I never asked for any of the fields above, only campaign.id and segments.device, yet it returns 36 fields. Any idea how to return just the fields I requested? If I print(row) directly, I can see that it only contains the fields requested in the query, so I have no idea where these extra fields are coming from.
Thanks!
Edit: I tinkered with the response a bit more and now get decent results, but this seems very complex considering all I want to do is go from protobuf to a DataFrame.
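    import pandas as pd
    from google.protobuf import json_format

    results = []
    for batch in response:
        for row in batch.results:
            pbrow = proto.Message.pb(row)                # unwrap to the raw protobuf message
            newrow = json_format.MessageToJson(pbrow)    # protobuf -> JSON string
            evalrow = eval(newrow)                       # JSON string -> dict
            df = pd.json_normalize(evalrow)              # dict -> one-row DataFrame
            results.append(df)
    print(results)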
Output (made-up campaign and customer IDs):

    [                      campaign.resourceName campaign.id segments.device
    0  customers/1234567891/campaigns/098765432   098765432         DESKTOP,
                           campaign.resourceName campaign.id segments.device
    0  customers/1234567891/campaigns/987654321   987654321          MOBILE,
                           campaign.resourceName campaign.id segments.device
    0  customers/1234567891/campaigns/876543210   876543210          TABLET,
                           campaign.resourceName campaign.id segments.device
    0  customers/1234567891/campaigns/765432109   765432109         DESKTOP,
                           campaign.resourceName campaign.id segments.device
    ]
Is there any way to simplify this? What am I missing that makes it take five transformations to combine the data stream?
Well, this has changed. Not sure if the solution was present at the time of the original comments. Here's the solution though:
    for batch in stream:
        for row in batch.results:
            newrow = proto.Message.to_json(
                row,
                preserving_proto_field_name=True,
                use_integers_for_enums=False,
                including_default_value_fields=False
            )
            print(newrow)

Effectively, I added use_integers_for_enums=False and including_default_value_fields=False. Both are True by default, which is why fields you never queried showed up with a value of 0.
This would go in place of your following code:
    for batch in response:
        for row in batch.results:
            newrow = proto.Message.to_json(row, preserving_proto_field_name=True)
            print(newrow)

To save you some time, here's a copy of what I'm using:
    import proto
    import json
    from google.ads.googleads.client import GoogleAdsClient
    from google.ads.googleads.errors import GoogleAdsException

    def main(client, customer_id):
        ga_service = client.get_service("GoogleAdsService")

        query = """
            SELECT
                campaign.id,
                campaign.name
            FROM campaign
            ORDER BY campaign.id"""

        # Issues a search request using streaming.
        stream = ga_service.search_stream(customer_id=customer_id, query=query)

        data = []
        for batch in stream:
            for row in batch.results:
                # Serialize only the populated fields, with enums as names.
                newrow = proto.Message.to_json(
                    row,
                    preserving_proto_field_name=True,
                    use_integers_for_enums=False,
                    including_default_value_fields=False
                )
                # Parse the JSON string into a plain dict.
                data.append(json.loads(newrow))
        return data
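
Since the original goal was a DataFrame, here is a minimal sketch of the last step, assuming data is a list of nested dicts like the one main returns above. The helper name rows_to_dataframe and the sample rows are made up for illustration; the only library call it relies on is pandas.json_normalize, which flattens nested keys into dotted column names.

```python
import pandas as pd


def rows_to_dataframe(data):
    """Flatten a list of nested row dicts into a single DataFrame.

    `data` is assumed to look like the output of main() above:
    one dict per GoogleAdsRow, with nested dicts per resource.
    """
    # json_normalize accepts the whole list at once, so there is no
    # need to build one DataFrame per row and combine them afterwards.
    return pd.json_normalize(data)


# Made-up sample rows standing in for the parsed JSON of each row.
# The ids are written as strings here; adjust if yours come back as ints.
sample = [
    {"campaign": {"resource_name": "customers/1234567891/campaigns/098765432",
                  "id": "098765432"},
     "segments": {"device": "DESKTOP"}},
    {"campaign": {"resource_name": "customers/1234567891/campaigns/987654321",
                  "id": "987654321"},
     "segments": {"device": "MOBILE"}},
    {"campaign": {"resource_name": "customers/1234567891/campaigns/876543210",
                  "id": "876543210"},
     "segments": {"device": "TABLET"}},
]

df = rows_to_dataframe(sample)
print(df)
```

With real data this would be called as rows_to_dataframe(main(client, customer_id)). Normalizing the full list in one call also avoids the eval step and the per-row DataFrames from the question's edit.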