Posts tagged with protocol-buffers

I have a .NET Framework 4.8 C# app that uses ClearScript to allow JavaScript to be used as an extension language. I am able to write plugins as DLLs and attach them at runtime, viz

JSE.Script.attach = (Func<string, bool>)Attach;
...

private static bool Attach(string dllPath, string name = "")
{
    var status = false;
    var htc = new HostTypeCollection();
    try
    {
        var assem = Assembly.Load(AssemblyName.GetAssemblyName(dllPath));
        htc.AddAssembly(assem);
        if (name.Length == 0)
        {
            name = assem.FullName.Split(',')[0];
        }
        JSE.AddHostObject(name, htc); //FIXME checkout the hosttypes
        Console.Error.WriteLine($"Attached {dllPath} as {name}");
        status = true;
    }
    catch (ReflectionTypeLoadException rtle)
    {
        foreach (var item in rtle.LoaderExceptions)
        {
            Console.Error.WriteLine(item.Message);
            T.Fail(item.Message);
        }
    }
    catch (FileNotFoundException fnfe)
    {
        Console.Error.WriteLine(fnfe.Message);
        T.Fail(fnfe.Message);
    }
    catch (Exception e)
    {
        Console.Error.WriteLine(e.Message);
        T.Fail(e.Message);
    }
    return status;
}

This permits my scripts to have lines like

attach(".\\Plugin_GoogleAds_Metrics.dll"); H = Plugin_GoogleAds_Metrics.GoogleAds_Metrics.Historical; H.EnableTrace("GAM"); ... 

I've made a public repo of the plugin for those interested.

What's not working: when I execute the plugin's GetAccountInformation method and execution reaches the line

GoogleAdsServiceClient googleAdsService = client.GetService(Services.V11.GoogleAdsService);

an error is thrown complaining about Google.Protobuf, viz

Exception has been thrown by the target of an invocation.
    at JScript global code (Script [23] [temp]:5:0) -> acc = H.GetAccountInformation(auths.Item1, 7273576109, true)
    at Microsoft.ClearScript.ScriptEngine.ThrowScriptError(IScriptEngineException scriptError)
    at Microsoft.ClearScript.Windows.WindowsScriptEngine.ThrowScriptError(Exception exception)
    at Microsoft.ClearScript.Windows.WindowsScriptEngine.<>c__DisplayClass57_0`1.<ScriptInvoke>b__0()
    at Microsoft.ClearScript.ScriptEngine.ScriptInvokeInternal[T](Func`1 func)
    at Microsoft.ClearScript.ScriptEngine.ScriptInvoke[T](Func`1 func)
    at Microsoft.ClearScript.Windows.WindowsScriptEngine.ScriptInvoke[T](Func`1 func)
    at Microsoft.ClearScript.Windows.WindowsScriptEngine.Execute(UniqueDocumentInfo documentInfo, String code, Boolean evaluate)
    at Microsoft.ClearScript.Windows.JScriptEngine.Execute(UniqueDocumentInfo documentInfo, String code, Boolean evaluate)
    at Microsoft.ClearScript.ScriptEngine.Evaluate(UniqueDocumentInfo documentInfo, String code, Boolean marshalResult)
    at Microsoft.ClearScript.ScriptEngine.Evaluate(DocumentInfo documentInfo, String code)
    at Microsoft.ClearScript.ScriptEngine.Evaluate(String documentName, Boolean discard, String code)
    at Microsoft.ClearScript.ScriptEngine.Evaluate(String documentName, String code)
    at Microsoft.ClearScript.ScriptEngine.Evaluate(String code)
    at RulesetRunner.Program.Run(JScriptEngine& jSE, String scriptText, Config cfg, Dictionary`2 settings) in C:\Users\bugma\Source\Repos\Present\BORR\RulesetRunner\RunManagementPartials.cs:line 72
Exception has been thrown by the target of an invocation.
Exception has been thrown by the target of an invocation.
Could not load file or assembly 'Google.Protobuf, Version=3.15.8.0, Culture=neutral, PublicKeyToken=a7d26565bac4d604' or one of its dependencies. The system cannot find the file specified.

So

  1. I am using the latest Google.Ads.GoogleAds library
  2. AutoGenerateBindingRedirects has been set to true in the csproj file
  3. Add-BindingRedirect has been executed in the context of the Plugin's project
  4. The Plugin_GoogleAds_Metrics.dll is in the same folder as the Google.Protobuf.dll

Where to from here?

Migrating from the AdWords API to the Google Ads API.

Querying search_term_view:

val googleAdsClient: GoogleAdsClient = GoogleAdsClient.newBuilder()
  .setCredentials(credential)
  .setDeveloperToken(developerToken)
  .setLoginCustomerId(loginCustomerId.toLong)
  .build()

val svc = googleAdsClient.getLatestVersion.createGoogleAdsServiceClient()

val query = s"""
  SELECT
    segments.keyword.info.text
    ,search_term_view.search_term
    ,segments.date
  FROM search_term_view
  WHERE segments.date BETWEEN '2022-01-01' AND '2022-01-01'
"""

svc.search(customerId, query).iteratePages().asScala.foreach { page =>
  page.iterateAll().asScala.foreach { row =>
    // row processing
  }
}

The issue is that svc.search() skips rows if one of the columns is null, so I get results like

text1,term1
text2,term2

The same request to the AdWords API returns results like

text1,term1
text2,term2
--,term3

I haven't found anything in the docs about nulls being ignored.

Using the latest Google Ads v10 library: "com.google.api-ads" % "google-ads" % "17.0.1"

I want to convert the GoogleAdsRows to JSON (to then put into a DataFrame). When using proto.Message.to_json, I do get JSON, but it also returns fields that I never queried.

Here's the code I'm using (I declare the credentials right before, but left that out so I can post):

import proto
from google.ads.googleads.client import GoogleAdsClient

credentials = {
    "developer_token": developer_token,
    "refresh_token": refresh_token,
    "client_id": client_id,
    "client_secret": client_secret,
    "login_customer_id": login_customer_id}

client = GoogleAdsClient.load_from_dict(credentials)
ga_service = client.get_service("GoogleAdsService", version='v6')

query = """
        SELECT
            campaign.id,
            segments.device
        FROM campaign
        WHERE segments.date = '20210405'
        LIMIT 10
"""

response = ga_service.search_stream(customer_id=customer_id, query=query)

for batch in response:
    for row in batch.results:
        newrow = proto.Message.to_json(row, preserving_proto_field_name=True)
        print(newrow)

returns (partial shown):

    "click_type": 0,     "conversion_action_category": 0,     "conversion_attribution_event_type": 0,     "conversion_lag_bucket": 0,     "conversion_or_adjustment_lag_bucket": 0,     "day_of_week": 0,     "external_conversion_source": 0,     "hotel_check_in_day_of_week": 0,     "hotel_date_selection_type": 0,     "hotel_rate_type": 0,     "hotel_price_bucket": 0,     "month_of_year": 0,     "placeholder_type": 0,     "product_channel": 0,     "product_channel_exclusivity": 0,     "product_condition": 0, 

So, I never asked for any of the fields above, only campaign.id and segments.device, yet it returns 36 fields. Any idea how to return just the fields I requested? If I print(row) directly, I can see that it only contains the fields requested in the query, so I have no idea where it is grabbing these extra fields from.

thanks!
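
A likely cause: proto.Message.to_json includes default-valued fields by default, so every unset scalar on the sub-messages you did query shows up in the output (the enums print as 0), whereas print(row) uses the proto text format, which omits unset fields. A minimal sketch of turning that off, assuming the installed proto-plus version exposes the including_default_value_fields flag:

newrow = proto.Message.to_json(
    row,
    preserving_proto_field_name=True,
    # assumption: this flag is available in recent proto-plus releases
    including_default_value_fields=False,
)

If the flag isn't available in your version, the json_format route in the edit below sidesteps the issue, since MessageToJson omits unset fields by default.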

Edit: I tinkered with the response a bit more and now have some decent results; however, this seems very complex considering all I want to do is go from protobuf to a DataFrame.

import pandas as pd
from google.protobuf import json_format

results = []
for batch in response:
    for row in batch.results:
        pbrow = proto.Message.pb(row)
        newrow = json_format.MessageToJson(pbrow)
        evalrow = eval(newrow)
        df = pd.json_normalize(evalrow)
        results.append(df)
print(results)

Output (made-up campaign/customer IDs):

[                      campaign.resourceName campaign.id segments.device
0  customers/1234567891/campaigns/098765432   098765432         DESKTOP,
                       campaign.resourceName campaign.id segments.device
0  customers/1234567891/campaigns/987654321   987654321          MOBILE,
                       campaign.resourceName campaign.id segments.device
0  customers/1234567891/campaigns/876543210   876543210          TABLET,
                       campaign.resourceName campaign.id segments.device
0  customers/1234567891/campaigns/765432109   765432109         DESKTOP,
                       campaign.resourceName campaign.id segments.device ]

Is there any way to simplify this? What the heck am I missing that this needs five transformations to combine the data stream?
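
One way to cut the pipeline down, as a sketch rather than the only option: convert each row's underlying protobuf message to a dict with json_format.MessageToDict (which omits unset fields by default), collect the dicts, and normalize once into a single DataFrame, avoiding eval and the per-row DataFrames entirely:

import pandas as pd
from google.protobuf import json_format

records = []
for batch in response:
    for row in batch.results:
        # MessageToDict skips unset fields, so only the queried columns appear
        records.append(
            json_format.MessageToDict(
                proto.Message.pb(row), preserving_proto_field_name=True
            )
        )

# one flat DataFrame with dotted columns such as campaign.id and segments.device
df = pd.json_normalize(records, sep=".")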

How do I use Google protocol buffers in a multiprocess script?

My use case is:

  • pulling data from the new Google Ads API
  • appending the objects with metadata
  • modelling using the objects
  • pushing the results to a database

AdWords Campaign Wrapper Object

I have an existing process for the old AdWords API, where I pull the data and store it in custom classes, e.g.

class Campaign(Represantable):
    def __init__(self, id, managed_customer_id, base_campaign_id, name, status, serving_status):
        self.id = id
        self.managed_customer_id = managed_customer_id
        self.base_campaign_id = base_campaign_id
        self.name = name
        self.status = status
        self.serving_status = serving_status

    @classmethod
    def from_zeep(cls, campaign, managed_customer_id):
        return cls(
            campaign.id,
            managed_customer_id,
            campaign.baseCampaignId,
            campaign.name,
            campaign.status,
            campaign.servingStatus
        )

Multiprocessing script

If I want to pull campaigns from a dozen accounts, I can run the scripts that populate the Campaign objects in parallel using pathos (again code simplified for this example):

import multiprocessing as mp
from pathos.pools import ProcessPool


class WithParallelism(object):
    def __init__(self, parallelism_level):
        self.parallelism_level = parallelism_level

    def _parallel_apply(self, fn, collection, **kwargs):
        pool = ProcessPool(
            nodes=self.parallelism_level
        )

        # this is to prevent Python from printing large traces when user interrupts execution (e.g. Ctrl+C)
        def keyboard_interrupt_wrapper_fn(*args_wrapped):
            try:
                return fn(*args_wrapped, **kwargs)
            except KeyboardInterrupt:
                pass
            except Exception as err:
                return err

        errors = pool.map(keyboard_interrupt_wrapper_fn, collection)
        return errors

Google Ads Campaign Wrapper Object

With the new API, I planned to store the protobuf object within my class and use properties to access the object's attributes. My class is a lot more complex than this example, using descriptors and subclass init for the attributes, but for simplicity it's effectively something like this:

class Campaign(Proto):
    def __init__(self, **kwargs):
        if "proto" in kwargs:
            self._proto = kwargs['proto']
        if "parent" in kwargs:
            self._parent = kwargs['parent']
        self._init_metadata(**kwargs)

    @property
    def id(self):
        return self._proto.id.value

    @property
    def name(self):
        return self._proto.name.value

    ...

This has the added advantage of being able to traverse the parent Google Ads object, to extract data from that protobuf object.

However, when I run my script to populate these new objects in parallel, I get a pickle error. I understand that multiprocess uses pickle to serialize the objects, and one of the key advantages of protobuf objects is that they can be easily serialized.

How should I go about pulling the new Google Ads data in parallel?

  • Should I serialize and deserialize the data in the Campaign object using SerializeToString (see the sketch after this list)?
  • Should I just extract and store the scalar data (id, name) as I did with AdWords?
  • Is there an entirely different approach?
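
A rough sketch of the first option, assuming _proto holds a standard generated protobuf message (so it exposes SerializeToString and FromString): let the wrapper pickle the wire-format bytes instead of the message itself, which keeps multiprocess/pathos happy regardless of whether the message class pickles natively. The Campaign below is only a stand-in for the Proto-based wrapper above:

class Campaign:  # stand-in for the Proto-based wrapper shown earlier
    def __init__(self, proto, parent=None):
        self._proto = proto
        self._parent = parent

    @property
    def id(self):
        return self._proto.id.value

    def __getstate__(self):
        state = self.__dict__.copy()
        # swap the (possibly unpicklable) message for its class and wire-format bytes
        state["_proto"] = (type(self._proto), self._proto.SerializeToString())
        return state

    def __setstate__(self, state):
        proto_cls, proto_bytes = state.pop("_proto")
        self.__dict__.update(state)
        # rebuild the message after crossing the process boundary
        self._proto = proto_cls.FromString(proto_bytes)

If _parent holds another wrapper, the same __getstate__/__setstate__ pair applies to it recursively; extracting plain scalars (the second option) remains the simpler route if the full message isn't needed downstream.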