
Re: [che-dev] How to collect and persist all workspace logs?

Hi all,

Just some thoughts about unified workspace logging, the ongoing work on the Workspace CRD, and cloud shell.

According to this EPIC https://github.com/eclipse/che/issues/15425,
there will be, at some point, the ability to start Che 7 workspaces in a lightweight,
standalone, and embeddable way, without requiring the presence of the Che master.

One important point mentioned in this EPIC is the significant scalability gain that this
envisioned K8S-native architecture would bring by removing the Postgres database in favor
of K8S-native Custom Resources, which would benefit from the highly-scalable etcd storage
underpinning K8S clusters.

It seems to me that these two points:
- Speak *against*:
  - Using the wsmaster server, or any required centralized component, either to store the logs or to collect them.
    In this regard, using the Postgres database seems the worst choice given the future architectural direction.
- Speak *in favor of*:
  - Collecting the logs locally in the workspace and sending them from the workspace POD to a logging mechanism,
    hopefully in a way that would be compatible with typical K8S logging infrastructures.
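A minimal sketch of what this could look like, following the standard K8S sidecar-logging pattern from [3]. All names, images, and paths below are illustrative assumptions, not existing Che resources: workspace containers write file logs to a shared volume, and a lightweight sidecar re-emits them on stdout so any typical K8S log pipeline (node agent, fluentd/fluent-bit, etc.) can pick them up.

```yaml
# Illustrative only: not an existing Che manifest.
apiVersion: v1
kind: Pod
metadata:
  name: workspace-logging-sketch
spec:
  containers:
  - name: tooling
    image: busybox
    # Stand-in for a workspace component that logs to a file.
    command: ["sh", "-c", "while true; do date >> /workspace-logs/tooling.log; sleep 5; done"]
    volumeMounts:
    - name: ws-logs
      mountPath: /workspace-logs
  - name: log-streamer
    image: busybox
    # Sidecar: re-emits the file log on stdout for the cluster's log pipeline.
    command: ["sh", "-c", "touch /workspace-logs/tooling.log && tail -f /workspace-logs/tooling.log"]
    volumeMounts:
    - name: ws-logs
      mountPath: /workspace-logs
  volumes:
  - name: ws-logs
    emptyDir: {}
```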

David.

On Thu, Jan 23, 2020 at 05:09, Michal Vala <mvala@xxxxxxxxxx> wrote:
Hello team,

we're currently working on improving the diagnosis capabilities[1] of workspaces; to
be more concrete, on how to get all logs from the workspace[2]. We're in the phase of
investigating options and prototyping, and we've come up with several variants for
achieving that goal. We would like to know your opinion and hear new ideas.

Requirements:
  - collect all logs of all containers from the workspace
  - stdout/err as well as file logs inside the container
  - keep history of last 5 runs of the workspace
  - collect logs of crashed workspace
  - make logs easily accessible to the user (rest API + dashboard view)
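For the "last 5 runs" requirement, the retention logic itself is simple if each workspace start gets its own run directory on the log volume. A hedged sketch; the log root, naming scheme, and `start_run` helper are assumptions for illustration, not Che conventions:

```shell
# Sketch: each workspace start creates a fresh run directory under the log
# root (which would really sit on the mounted PV); after creating it, prune
# so only the 5 most recent runs survive.
LOG_ROOT="${LOG_ROOT:-/tmp/ws-log-demo}"
KEEP=5

start_run() {
  run_dir="$LOG_ROOT/run-$1"
  mkdir -p "$run_dir"
  # Newest-first listing; delete everything after the KEEP-th entry.
  ls -1dt "$LOG_ROOT"/run-* | tail -n +"$((KEEP + 1))" | xargs -r rm -rf
  printf '%s\n' "$run_dir"
}

# Simulate seven workspace starts; only runs 003..007 should remain.
rm -rf "$LOG_ROOT"
for i in 1 2 3 4 5 6 7; do
  start_run "$(printf '%03d' "$i")" > /dev/null
  sleep 1   # ensure distinct mtimes so `ls -t` orders the runs correctly
done
```

In a real workspace the run id would come from the workspace start, not a loop; the loop only simulates seven starts to show the pruning.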


I've split the effort into two sections:

  ### How to collect:

    # log everything to files on a mounted PV
      - just mount a PV and log everything there
      - pros
        - not much extra overhead: only writing stdout/err to a file on the mounted PV
        - no extra hw resources needed (memory/cpu)
      - cons
        - we might need to override the `command` of all containers so that they
          run with extra parameters to write stdout/err to the file,
          something like `<command> 2>&1 | tee ws.log`
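The `tee` idea from the first option can be sketched as a small wrapper. The log path below is illustrative and would really point at the mounted PV; note also that in plain `sh` the pipe hides the wrapped command's exit status (bash's `set -o pipefail` avoids that):

```shell
# Illustrative wrapper: run the container's original command so stdout and
# stderr are both shown live and appended to a log file on the volume.
LOG_FILE="${LOG_FILE:-/tmp/ws-demo.log}"

run_logged() {
  # "$@" is the container's original command.
  "$@" 2>&1 | tee -a "$LOG_FILE"
}

# prints "workspace component started" and appends it to $LOG_FILE
run_logged echo "workspace component started"
```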

    # workspace collector sidecar (kubernetes/client-go app?)
      - pros
        - per workspace
        - dynamic and powerful
      - cons
        - very custom solution and might be hard to manage/maintain
        - unknown performance and hw resources requirements
        - hard to handle when the ws crashes
        - needs more memory per workspace, even if the user never looks at the
          logs and everything works as expected

    # watch and collect from master
      - pros
        - easy to grab logs and events
        - easy to access archived logs
      - cons
        - only the containers' stdout/stderr
        - must keep a connection open to each ws
        - more network traffic
        - increases the memory footprint of the master

    # kubernetes native
      - change the logging backend of kubernetes [3]
      - pros
        - standard k8s way, "googleable"
      - cons
        - depends on kubernetes deployment
        - needs extra cluster component/configuration
        - only stdout/err of containers

    # push logs directly from containers to logging backend
      - cons
        - every component must be customized to log to the backend
        - performance and hw resource overhead

    # collect on workspace exit
      - mount a PV and log there; when the workspace exits, start a collector
        pod that grabs the logs and "archives" them
      - pros
        - not much extra overhead
      - cons
        - no logs available for a running workspace
        - requires a custom collector pod
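The "archive" step of this option could be as small as packing the run's log directory from the PV into a timestamped tarball. Everything below (paths, names) is an assumption for illustration:

```shell
# Illustrative collector step: archive a run's log directory so it
# survives workspace restarts and can be served later.
LOG_DIR="${LOG_DIR:-/tmp/ws-log-demo-run}"
ARCHIVE_DIR="${ARCHIVE_DIR:-/tmp/ws-log-archive}"

mkdir -p "$LOG_DIR" "$ARCHIVE_DIR"
echo "sample log line" > "$LOG_DIR/ws.log"   # stand-in for real workspace logs

# One tarball per run, named by timestamp.
stamp=$(date +%Y%m%d-%H%M%S)
tar -czf "$ARCHIVE_DIR/ws-logs-$stamp.tar.gz" -C "$LOG_DIR" .
```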


  ### Where to store and how to access:

    # Workspace PV
      - pros
        - easy to set quota per user
      - cons
        - harder to access (need to start a pod in the workspace's namespace)
        - lost when the namespace is deleted

    # Che PV
      - pros
        - easier to access
      - cons
        - harder to set quota per user
        - harder to scale and manage
        - possible performance bottleneck

    # PostgreSQL
      - pros
        - the easiest to access
      - cons
        - harder to set quota per user
        - harder to scale and manage
        - possible performance bottleneck


There is one remaining and very important question we have not investigated
much. We need to somehow configure all plugins/editors and other components to
tell us where the log files that should be collected are located. Otherwise, we
would not be able to find the logs inside the containers. We would need to
handle that in the plugin's `meta.yaml` as well as in the devfile.
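For illustration only, one way to declare this could be a per-component attribute in the devfile. `logFiles` is NOT an existing devfile or `meta.yaml` field, just a hypothetical shape for the discussion:

```yaml
# Hypothetical: "logFiles" does not exist in the devfile schema today.
components:
  - alias: maven-tooling
    type: dockerimage
    image: eclipse/che-java11-maven   # illustrative image
    attributes:
      logFiles:
        - /home/user/.m2/build.log
        - /var/log/tooling/*.log
```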

What's next?
  We would like to investigate and prototype the following solution:
    - collect all ws logs into files and store them in a PV in the workspace
    - watch ws events from the master and, on exit, start a collector pod that
      collects all the logs and passes them to the backend. The logs backend is
      still to be defined: it might be just a PV dedicated to archiving logs,
      some new service, or the Che master.
    - prototype a new Che master API to access the logs. If we store them in
      the workspace's PV, start the collector pod on demand to access the logs.
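The on-demand collector pod from the last bullet could look roughly like this; the pod name, PVC name, and paths are assumptions, not existing Che resources:

```yaml
# Illustrative sketch of an on-demand log collector pod.
apiVersion: v1
kind: Pod
metadata:
  name: ws-log-collector
spec:
  restartPolicy: Never
  containers:
  - name: collector
    image: busybox
    # Archive the mounted logs, then stay alive so the file can be copied out.
    command: ["sh", "-c", "tar -czf /tmp/ws-logs.tar.gz -C /workspace-logs . && sleep 3600"]
    volumeMounts:
    - name: ws-logs
      mountPath: /workspace-logs
      readOnly: true
  volumes:
  - name: ws-logs
    persistentVolumeClaim:
      claimName: workspace-logs-pvc   # illustrative claim name
```

Once the pod is running, the archive could be fetched with `kubectl cp ws-log-collector:/tmp/ws-logs.tar.gz ./ws-logs.tar.gz`.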


We would very much welcome any opinions or ideas.


[1] - https://github.com/eclipse/che/issues/15047
[2] - https://github.com/eclipse/che/issues/15134
[3] - https://kubernetes.io/docs/concepts/cluster-administration/logging/#sidecar-container-with-a-logging-agent




--

David Festal

Principal Software Engineer, DevTools

Red Hat France

dfestal@xxxxxxxxxx  


