decouple product data from runtime data and code

Product data is preformatted into Redis and Elasticsearch. There should be a way to decouple the generation of these assets from the runtime data (e.g. orders, customers). It should also be decoupled from the code.
Currently, introducing breaking changes to the structure of the Redis or search data requires either downtime while everything is recalculated, or manually hacking around with different Zed instances and Redis instances or search indices (e.g. via aliases).

  • Guest
  • Sep 18 2018
  • In review
  • Admin
    Dmytro Mykhailov commented
    September 26, 2018 10:33

    Hi Georg,

    I want to make sure that I understand your problem. Are you talking about the "data" field in the *_search and *_storage DB tables (Zed side), which is used to populate Elasticsearch and Redis (Yves side) with the data?
    Could you please explain what you are trying to achieve by decoupling runtime data, and what the ideal solution might look like for you?

    Thanks a lot.
    - Dmitriy

  • Guest commented
    September 26, 2018 12:33

    Hi Dmitriy

    This is perhaps a bit verbose for a comment here, but I'll try to keep it short. What we want to achieve is a) the possibility to deploy code without hours of downtime and b) to be able to apply new sets of data (which we export from our external PIM system on a daily basis and import into the database) without considerable downtime.

    At the moment, if any code change affects the search or Redis data formats, we cannot deploy it without rerunning the whole P&S cycle and accepting downtime while it runs. One workaround would be to configure different Redis DBs and search indices for the new dataset and couple one codebase with one set of Redis DB/index. This means that when Yves is switched, the Redis DB/search index in use would switch as well.
    Importing a new set of data points in the same direction. It would be nice to be able to prepare new sets of data and apply the changes all at once, so that the data is consistent at all times. If we applied the changes one by one, this could lead to poorly defined states in Redis and search. This is (as I wrote) less important than the buggy code (buggy when applied to a specific version of the data), but also not what we want.

    For me, a perfect solution takes care of this versioning problem. There are different approaches for the different technologies used. Database-wise, this problem is handled (at least for non-breaking, structural changes) by Propel (diff and migrate). Elasticsearch suggests using index aliases, and Redis has a command called SWAPDB to achieve similar behaviour. But in general I would like this to be an "implementation detail" which I don't need to know about, because Spryker has some abstraction on top of it to handle the issue.
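
    As a rough illustration of the two mechanisms mentioned here, the sketch below only builds the requests rather than sending them: the body for Elasticsearch's `_aliases` endpoint, which removes and adds an alias in one atomic call, and the Redis `SWAPDB` command, which exchanges two logical databases. The alias name, index names and database numbers are made up for the example, and nothing in it is Spryker-specific.

```python
import json


def alias_swap_body(alias, old_index, new_index):
    """Body for POST /_aliases: both actions are applied in one atomic step."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


def swapdb_command(live_db, staging_db):
    """Redis SWAPDB (available since Redis 4.0) as a list of command parts."""
    return ["SWAPDB", str(live_db), str(staging_db)]


if __name__ == "__main__":
    # Hypothetical names: readers query the alias "products",
    # while the data lives in versioned indices behind it.
    body = alias_swap_body("products", "products_v1", "products_v2")
    print(json.dumps(body, indent=2))
    print(" ".join(swapdb_command(0, 1)))
```

    In both cases the new dataset is built next to the live one and only becomes visible when the swap is issued.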

    At the core of the problem are a) that we don't use Spryker to manage data (and the duration of a full data import / update) and b) the performance of P&S. The pain of the downtime needed is directly related to the duration of these processes.

    I hope this was helpful. 


  • Guest commented
    September 27, 2018 17:23

    Hi Georg
    Correct me if I'm wrong: you need a feature which sends storage and search documents to multiple Redis/ES instances (DBs/indices) at the same time, then you want to update one of the Redis/ES instances via your PIM while the system is running, and have some way (like a console command) to switch the Redis/ES instance, right?

  • Guest commented
    September 28, 2018 05:38

    Hi Ehsan

    From the Yves perspective this would only mean switching the DBs, correct. (Let's talk about Postgres, Redis and ES as DBs here, since which technology is used, or whether it is even the same technology, is not the core of the problem.)
    On the Zed side this would perhaps mean running 2 different instances: one serving the customer traffic and another (which may have a new codebase) building up the new product data. The second instance does not need to serve customer traffic in the meantime.

    And then it would be nice to somehow be able to coordinate the switch to a new codebase on the Yves instances and the customer-serving Zed (if there is one) with the switch to the new dataset. If this is somehow possible with a single console command: perfect. If it needs several commands, run one after another on different systems: still no big deal. On the other hand, if this meant fiddling with config values manually: error-prone and painful...

  • Guest commented
    October 23, 2018 09:13



    Just to drop this as a note and to get people on the same page here:

    The described concept is widely known as blue-green deployment. With more moving parts involved, this is a little more complex in the Spryker case (e.g. first updating Zed, or rather some Zed nodes, filling the stores in the background with the new structure, and then atomically updating Yves and switching datastores, all in a rolling kind of manner).

    This should already be possible as some rather complex DevOps-y topic.
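
    To make the blue-green idea a bit more concrete, here is a minimal in-memory sketch, assuming two complete sets of datastores ("slots"): one is live, the other is idle and can be rebuilt in the background before traffic is flipped over. The class names, slot names and fields are hypothetical and only model the bookkeeping, not any actual Spryker or DevOps tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slot:
    """One complete set of datastores (all values are hypothetical)."""
    redis_db: int
    search_index: str


class BlueGreen:
    """Tracks which slot serves traffic and which one may be rebuilt."""

    def __init__(self, blue, green):
        self.slots = {"blue": blue, "green": green}
        self.live = "blue"

    @property
    def idle(self):
        return "green" if self.live == "blue" else "blue"

    def build_target(self):
        """Slot that the background Zed instance should fill."""
        return self.slots[self.idle]

    def switch(self):
        """Flip customer traffic to the freshly built slot."""
        self.live = self.idle
        return self.slots[self.live]


if __name__ == "__main__":
    state = BlueGreen(Slot(0, "products_blue"), Slot(1, "products_green"))
    print("build into:", state.build_target())
    print("now live:", state.switch())
```

    After the switch, the previously live slot becomes the build target for the next release.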



  • Guest commented
    08 Jan 10:57

    Short update: We've built it, on a bash / configuration basis, via dynamically written PHP files which get included in the config files. If anyone is interested in our solution, feel free to contact me.
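
    The details of that solution are not described in this thread, so the following is only a guess at the general shape of such an approach: render a small PHP file that selects the datastores and replace it atomically, so that a config file including it never sees a half-written version. The `$config` keys and the file name are placeholders, not real Spryker constants, and the values are assumed to come from a trusted source.

```python
import os
import tempfile


def render_php_config(redis_db, search_index):
    """Render a PHP snippet; the $config keys are placeholders."""
    return (
        "<?php\n"
        f"$config['REDIS_DATABASE'] = {int(redis_db)};\n"
        f"$config['SEARCH_INDEX'] = '{search_index}';\n"
    )


def write_atomically(path, content):
    """Write to a temp file in the same directory, then rename over the target."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        handle.write(content)
    os.replace(tmp_path, path)


if __name__ == "__main__":
    # Hypothetical file name, written into a scratch directory.
    target = os.path.join(tempfile.mkdtemp(), "config_datastore.php")
    write_atomically(target, render_php_config(1, "products_v2"))
    with open(target) as handle:
        print(handle.read())
```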