Decouple product data from runtime data and code

Product data is preformatted into Redis and Elasticsearch. There should be a way to decouple the generation of these assets from the runtime data (e.g. orders, customers), and from code as well.
Currently, introducing breaking changes to the structure of the Redis or search data requires either downtime while everything is recalculated, or manual workarounds with separate Zed instances, Redis instances, or search indices (e.g. via aliases).

  • Guest
  • Sep 18 2018
  • In review
How can we make you more productive?

Provide a decent way to decouple product data, especially from code.

Company Winterhalter-Fenner AG
  • Admin
    Dmitriy Mikhailov commented
    26 Sep 10:33

    Hi Georg,

    I want to make sure that I understand your problem. Are you talking about the "data" field in the *_search and *_storage DB tables (Zed side), which are used to populate Elasticsearch and Redis (Yves side) with the data?
    Could you please explain what you are trying to achieve by decoupling runtime data, and what the ideal solution might look like for you?

    Thanks a lot.
    - Dmitriy

  • Guest commented
    26 Sep 12:33

    Hi Dmitriy

    This is perhaps a bit verbose for a comment here, but I'll try to keep it short. What we want to achieve is a) the possibility to deploy code without hours of downtime and b) the ability to apply new sets of data (which we export from our external PIM system on a daily basis and import into the database) without significant downtime.

    At the moment, if any code change touches the search or Redis data formats, we cannot deploy it without rerunning the whole P&S cycle, and that means downtime while the cycle runs. One workaround would be to configure separate Redis DBs and search indices for the new dataset and couple each codebase with its own Redis DB and index. Switching Yves would then also switch the Redis DB and search index in use.
    Importing a new set of data points in the same direction. It would be nice to be able to prepare new sets of data and apply the changes all at once, so that the data is consistent at all times. If we applied the changes one by one, this could lead to ill-defined states in Redis and search. This is (as I wrote) less important than the buggy code (buggy when applied to a specific version of the data), but also not what we want.

    For me, a perfect solution takes care of this versioning problem. There are different approaches for the different technologies used. Database-wise, this problem is handled (at least for non-breaking, structural changes) by Propel (diff and migrate). Elasticsearch suggests using index aliases, and Redis has a command called SWAPDB to achieve similar behaviour. But in general I would like this to be an "implementation detail" that I don't need to know about, because Spryker has some abstraction on top of it to handle the issue.
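
    As a rough illustration of the two switching mechanisms mentioned above, the sketch below only builds the requests and sends nothing. The alias name `products`, the index names and the DB numbers are made up for the example; the `_aliases` action format and the `SWAPDB` command are the documented Elasticsearch and Redis interfaces.

```python
import json


def build_alias_swap(alias, old_index, new_index):
    """Body for Elasticsearch `POST /_aliases`.

    Both actions in one request are applied atomically, so readers of
    the alias never see a moment without an index behind it.
    """
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


def build_redis_swap(live_db, staging_db):
    """Arguments of the Redis `SWAPDB` command (available since Redis 4.0).

    SWAPDB exchanges two logical databases, so clients connected to the
    live DB immediately see the data prepared in the staging DB.
    """
    return ["SWAPDB", str(live_db), str(staging_db)]


if __name__ == "__main__":
    # Hypothetical names: Yves would query the alias "products" only.
    print(json.dumps(build_alias_swap("products", "products_v1", "products_v2"), indent=2))
    print(" ".join(build_redis_swap(0, 1)))
```

    Note that the two switches are atomic individually, not together, so something still has to coordinate them with the code rollout.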

    At the core of the problem are a) the fact that we don't use Spryker to manage data (and the duration of a full data import/update) and b) the performance of P&S. The pain of the downtime is directly related to the duration of these processes.

    I hope this was helpful. 


  • Admin
    Ehsan Zanjani commented
    27 Sep 17:23

    Hi Georg
    Correct me if I'm wrong: you need a feature that sends storage and search documents to multiple Redis/ES targets (DBs/indices) at the same time. You then want to update one of the Redis/ES instances via your PIM while the system is running, and have some way (like a console command) to switch the Redis/ES instance, right?

  • Guest commented
    28 Sep 05:38

    Hi Ehsan

    From Yves' perspective this would only mean switching the DBs (let's treat Postgres, Redis and ES all as DBs here, since which technology is used, or whether it is even the same technology, is not the core of the problem), correct.
    On the Zed side this would perhaps mean running 2 different instances: one serving the customer traffic and another (which may have a new codebase) building up the new product data. The second instance does not need to serve customer traffic in the meantime.

    It would then be nice to somehow coordinate the switch to the new codebase on the Yves instances and on the customer-serving Zed (if applicable) with the switch to the new dataset. If this is possible with a single console command: perfect. If it needs several commands, run one after another on different systems: still no big deal. On the other hand, if it means fiddling with config values manually: error-prone and painful...

  • Guest commented
    23 Oct 09:13



    Just to drop this as a note and to get people on the same page here:

    The described concept is widely known as blue-green deployment. With more moving parts involved, this is a little more complex in the Spryker case (e.g. first updating Zed, or rather some Zed nodes, filling the stores in the background with the new structure, and then atomically updating Yves and switching datastores, all of this in a rolling kind of manner).

    This should already be possible, albeit as a rather complex DevOps-y topic.
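
    The rollout order described above can be sketched as a tiny in-memory model. This is not Spryker code: `Datastore`, `Shop`, `publish_to_staging` and `switch` are hypothetical names, and one `Datastore` stands in for a Redis DB plus a search index.

```python
from dataclasses import dataclass, field


@dataclass
class Datastore:
    """Stand-in for one Redis DB plus one search index (hypothetical)."""
    name: str
    schema_version: int
    documents: dict = field(default_factory=dict)


@dataclass
class Shop:
    """Yves reads only from `live`; the background Zed writes only to `staging`."""
    live: Datastore
    staging: Datastore
    yves_code_version: int

    def publish_to_staging(self, products, schema_version):
        # Full rebuild in the background; the live store is never touched.
        self.staging = Datastore(self.staging.name, schema_version, dict(products))

    def switch(self, new_code_version):
        # Refuse to pair a codebase with data built for another format.
        if self.staging.schema_version != new_code_version:
            raise ValueError("staging data does not match the code being released")
        # Code and data change together; the old store stays available for rollback.
        self.live, self.staging = self.staging, self.live
        self.yves_code_version = new_code_version


if __name__ == "__main__":
    shop = Shop(
        live=Datastore("blue", 1, {"sku-1": {"name": "Old format"}}),
        staging=Datastore("green", 1),
        yves_code_version=1,
    )
    shop.publish_to_staging({"sku-1": {"title": "New format"}}, schema_version=2)
    print(shop.live.name, shop.yves_code_version)  # still serving the old pair
    shop.switch(new_code_version=2)
    print(shop.live.name, shop.yves_code_version)
```

    In a real setup the swap inside `switch` would be the alias and DB switch plus the Yves rollout, which is where the coordination effort sits.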